How to Handle JSON Data in SQL Databases (MySQL, SQL Server, Hive)

JSON (JavaScript Object Notation) has become the universal language for exchanging data between applications. Whether you’re pulling data from APIs, storing logs, or dealing with semi-structured data in a data lake — JSON is everywhere.

But here’s the challenge: most SQL databases were built for structured data, not nested JSON.

So how do you read, store, and query JSON efficiently in different SQL environments?

In this post, we’ll go through how to handle JSON data across popular databases — MySQL, SQL Server, Hive SQL, and PostgreSQL — including how to:

  • Convert multiple columns into JSON
  • Parse JSON fields
  • Extract nested values
  • Flatten JSON into columns

Let’s break it down database by database.


What Is JSON in SQL Context?

JSON is a lightweight data-interchange format that looks like this:

{
  "CustomerID": 101,
  "Name": "Alice",
  "Region": "East",
  "Orders": [
    { "OrderID": 1, "Amount": 250 },
    { "OrderID": 2, "Amount": 400 }
  ]
}

It’s great for flexible, schema-less storage.
But querying JSON in SQL requires special functions, since it’s essentially a text field with structure.
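Before diving into the SQL dialects, it helps to pin down what "extracting a path" means. A minimal Python sketch, using only the standard json module, pulls the same nested values the SQL path expressions below will target:

```python
import json

# The sample customer document from above, as a JSON string
raw = '''
{
  "CustomerID": 101,
  "Name": "Alice",
  "Region": "East",
  "Orders": [
    { "OrderID": 1, "Amount": 250 },
    { "OrderID": 2, "Amount": 400 }
  ]
}
'''

doc = json.loads(raw)                      # parse text into a structure
name = doc["Name"]                         # what '$.Name' resolves to
first_amount = doc["Orders"][0]["Amount"]  # what '$.Orders[0].Amount' resolves to

print(name, first_amount)  # Alice 250
```

Every SQL JSON function in this post is doing some version of this: parse the text, walk a path, return the value.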


MySQL: The Most JSON-Friendly of the Traditional SQLs

Since MySQL 5.7, JSON has been a native data type, which means it understands JSON structure internally and provides dedicated functions to manipulate it.

Create and Store JSON

CREATE TABLE Customers (
    CustomerID INT,
    CustomerData JSON
);

Insert JSON directly:

INSERT INTO Customers VALUES 
(1, '{"Name": "Alice", "Region": "East", "Orders": [100, 200, 300]}');

Parse JSON

Use JSON_EXTRACT() or shorthand -> operator:

SELECT 
    JSON_EXTRACT(CustomerData, '$.Name') AS Name,
    JSON_EXTRACT(CustomerData, '$.Orders[0]') AS FirstOrder
FROM Customers;

Or the shorter syntax:

SELECT 
    CustomerData->'$.Name' AS Name,
    CustomerData->'$.Orders[1]' AS SecondOrder
FROM Customers;

Note that -> returns a JSON value, so strings keep their quotes (e.g. "Alice"); use ->> (or JSON_UNQUOTE()) when you want the plain text.

Convert Multiple Columns to JSON

You can build JSON objects directly from columns:

SELECT 
    JSON_OBJECT('CustomerID', CustomerID, 'Region', Region, 'Sales', Sales) AS CustomerJSON
FROM Sales;

Or if you want an array of JSON objects:

SELECT 
    JSON_ARRAYAGG(JSON_OBJECT('OrderID', OrderID, 'Amount', Amount)) AS Orders
FROM Orders
GROUP BY CustomerID;

Tip: MySQL automatically validates JSON syntax when you use the JSON data type — it won’t accept malformed data.
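If you want to sanity-check what JSON_ARRAYAGG(JSON_OBJECT(...)) with a GROUP BY should produce, the same aggregation is easy to sketch in Python. The rows below stand in for a hypothetical Orders table with (CustomerID, OrderID, Amount) columns:

```python
import json
from collections import defaultdict

# Hypothetical rows from an Orders table: (CustomerID, OrderID, Amount)
rows = [(1, 10, 250), (1, 11, 400), (2, 12, 150)]

# Equivalent of JSON_ARRAYAGG(JSON_OBJECT(...)) ... GROUP BY CustomerID:
# collect one object per order, grouped by customer.
grouped = defaultdict(list)
for customer_id, order_id, amount in rows:
    grouped[customer_id].append({"OrderID": order_id, "Amount": amount})

orders_json = {cid: json.dumps(orders) for cid, orders in grouped.items()}
print(orders_json[1])  # [{"OrderID": 10, "Amount": 250}, {"OrderID": 11, "Amount": 400}]
```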


SQL Server: Powerful JSON Functions (No Native JSON Type)

SQL Server doesn’t have a dedicated JSON type. Since SQL Server 2016, it stores JSON as NVARCHAR text and provides built-in functions to handle it.

Convert Columns to JSON

Use FOR JSON to create JSON objects from query results.

SELECT CustomerID, Name, Region
FROM Customers
FOR JSON AUTO;

Result:

[{"CustomerID":1,"Name":"Alice","Region":"East"},
 {"CustomerID":2,"Name":"Bob","Region":"West"}]

To control structure, use FOR JSON PATH:

SELECT 
  CustomerID AS "Customer.CustomerID",
  Name AS "Customer.Name",
  Region AS "Customer.Region"
FROM Customers
FOR JSON PATH;

This gives nested JSON under "Customer".


Parse JSON

Use OPENJSON() to convert JSON text into rows and columns.

DECLARE @data NVARCHAR(MAX) = 
'{"CustomerID":1, "Orders":[{"OrderID":10,"Amount":200},{"OrderID":11,"Amount":400}]}';

SELECT * 
FROM OPENJSON(@data, '$.Orders')
WITH (
    OrderID INT '$.OrderID',
    Amount FLOAT '$.Amount'
);

Result:

OrderID | Amount
--------|-------
10      | 200
11      | 400

JSON Value Extraction

You can also grab specific fields using JSON_VALUE() and JSON_QUERY():

SELECT 
    JSON_VALUE(@data, '$.CustomerID') AS CustomerID,
    JSON_QUERY(@data, '$.Orders') AS OrdersArray;
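Conceptually, OPENJSON's WITH clause means "walk to the array, then project each element into typed columns." A rough Python equivalent of the OPENJSON query above, for intuition only:

```python
import json

data = '{"CustomerID":1, "Orders":[{"OrderID":10,"Amount":200},{"OrderID":11,"Amount":400}]}'

# Mirror OPENJSON(@data, '$.Orders') WITH (OrderID INT, Amount FLOAT):
# walk into $.Orders, then emit one typed row per array element.
rows = [
    (int(o["OrderID"]), float(o["Amount"]))
    for o in json.loads(data)["Orders"]
]

print(rows)  # [(10, 200.0), (11, 400.0)]
```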

PostgreSQL: json vs jsonb (Fast and Flexible)

PostgreSQL was one of the first relational databases to truly embrace JSON, with two key types:

  • json – stored as plain text (slower)
  • jsonb – stored in binary format (faster, indexable)

Store JSON

CREATE TABLE customers (
  id SERIAL PRIMARY KEY,
  data JSONB
);

Insert JSON data:

INSERT INTO customers (data) 
VALUES ('{"name": "Alice", "region": "East", "orders": [100,200,300]}');

Query JSON Fields

Use PostgreSQL’s powerful -> and ->> operators:

SELECT 
  data->>'name' AS name,
  data->'orders'->>0 AS first_order
FROM customers;

  • -> returns JSON
  • ->> returns text
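The JSON-versus-text distinction trips people up: -> keeps the value as JSON (so strings stay quoted), while ->> unwraps it to plain text. In plain Python terms:

```python
import json

data = json.loads('{"name": "Alice", "region": "East", "orders": [100, 200, 300]}')

# data->'name'  : still a JSON value, so serializing it keeps the quotes
as_json = json.dumps(data["name"])   # '"Alice"'
# data->>'name' : the unwrapped text value
as_text = data["name"]               # 'Alice'

print(as_json, as_text)
```

If a downstream comparison like `WHERE data->'name' = 'Alice'` silently fails, quoted-vs-unquoted is usually the culprit.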

Convert Columns into JSON

SELECT 
  to_jsonb(c) AS customer_json
FROM customers_info AS c;

Passing the row alias to to_jsonb() turns the whole row into a JSONB object keyed by column name.

Or aggregate JSON objects:

SELECT 
  json_agg(json_build_object('id', id, 'name', name, 'region', region)) AS customers
FROM customers_info;

Hive SQL: JSON in Big Data Ecosystems

Apache Hive and Spark SQL are often used for semi-structured data, especially logs and event payloads stored as JSON.

Hive doesn’t have a dedicated JSON type, but you can use SerDes (Serializers/Deserializers) or built-in functions.

Parse JSON String in Hive

Using get_json_object():

SELECT 
    get_json_object(json_col, '$.name') AS name,
    get_json_object(json_col, '$.orders[0].amount') AS first_order
FROM sales_json;

Explode JSON Arrays

Use LATERAL VIEW to flatten JSON arrays. Hive can’t explode a JSON array string directly, so a common workaround is to split the array string into one JSON object per row, then extract fields from each piece:

SELECT
    a.name,
    get_json_object(b.order_json, '$.OrderID') AS order_id,
    get_json_object(b.order_json, '$.Amount') AS amount
FROM sales_json
LATERAL VIEW json_tuple(json_col, 'name', 'orders') a AS name, orders
LATERAL VIEW explode(split(regexp_replace(regexp_replace(a.orders, '^\\[|\\]$', ''), '\\},\\{', '}|{'), '\\|')) b AS order_json;

(In Spark SQL, from_json() with an array schema plus explode() is a cleaner alternative.)
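Because Hive sees the orders array as a plain string, pipelines often fall back to string surgery: strip the brackets, split on the boundaries between objects, then parse each piece. The logic is easy to verify in Python (note this simple split assumes the array elements are flat objects with no nesting):

```python
import json
import re

orders = '[{"OrderID": 1, "Amount": 250},{"OrderID": 2, "Amount": 400}]'

# Strip the surrounding [ ], then split on the "},{" boundary
# between array elements; same idea as the Hive
# regexp_replace + split + explode workaround.
stripped = re.sub(r'^\[|\]$', '', orders)
pieces = re.split(r'(?<=\})\s*,\s*(?=\{)', stripped)

amounts = [json.loads(p)["Amount"] for p in pieces]
print(amounts)  # [250, 400]
```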

Convert Columns Into JSON

Combine named_struct() with to_json() where available (to_json() is built into Spark SQL; in Hive it typically comes from a UDF library such as Brickhouse):

SELECT 
    to_json(named_struct(
        'CustomerID', CustomerID,
        'Region', Region,
        'Sales', Sales
    )) AS customer_json
FROM sales_table;

Other Databases with JSON Support

Database  | JSON Type               | Key Functions
----------|-------------------------|--------------------------------------
BigQuery  | JSON (string or struct) | JSON_EXTRACT(), PARSE_JSON()
Snowflake | VARIANT                 | PARSE_JSON(), TO_VARIANT(), FLATTEN()
Oracle    | JSON (as CLOB or BLOB)  | JSON_VALUE(), JSON_TABLE()

Example (Snowflake):

SELECT
    value:CustomerID::int AS CustomerID,
    value:Name::string AS Name
FROM TABLE(FLATTEN(input => PARSE_JSON(column_with_json)));

Practical Tips for Analysts and BI Developers

  1. Convert columns to JSON only when needed – it’s slower than relational joins for large datasets.
  2. Use JSON functions wisely – they’re CPU-intensive; avoid unnecessary parsing.
  3. Always validate JSON structure before inserting (use ISJSON() in SQL Server or json_valid() in MySQL).
  4. Leverage BI tools with native JSON parsing (Power BI, Tableau Prep, Looker) for visualization-level flexibility.
  5. Document JSON schemas for analysts — it makes data models more transparent.
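Tip 3 applies outside the database too: validating before loading is cheaper than cleaning up after. A rough client-side counterpart to ISJSON() / JSON_VALID() is just a guarded parse:

```python
import json

def is_valid_json(text: str) -> bool:
    """Rough client-side counterpart to SQL Server's ISJSON()
    or MySQL's JSON_VALID(): True only if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except (ValueError, TypeError):
        return False

print(is_valid_json('{"name": "Alice"}'))  # True
print(is_valid_json('{name: Alice}'))      # False (keys must be quoted)
```

Running a check like this in your ingestion script keeps malformed payloads out of the pipeline before any SQL ever sees them.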

Key Takeaways

  • MySQL and PostgreSQL have native JSON support and strong functions.
  • SQL Server supports JSON via NVARCHAR with great parsing tools (OPENJSON, FOR JSON).
  • Hive and Spark SQL handle JSON at scale, but require exploding and deserializing manually.
  • Always validate and format your JSON early in the pipeline.
  • Use JSON for flexibility, not as a replacement for good schema design.
