JSON (JavaScript Object Notation) has become the universal language for exchanging data between applications. Whether you’re pulling data from APIs, storing logs, or dealing with semi-structured data in a data lake — JSON is everywhere.
But here’s the challenge: most SQL databases were built for structured data, not nested JSON.
So how do you read, store, and query JSON efficiently in different SQL environments?
In this post, we’ll go through how to handle JSON data across popular databases — MySQL, SQL Server, Hive SQL, and PostgreSQL — including how to:
- Convert multiple columns into JSON
- Parse JSON fields
- Extract nested values
- Flatten JSON into columns
Let’s break it down database by database.
What Is JSON in SQL Context?
JSON is a lightweight data-interchange format that looks like this:
{
  "CustomerID": 101,
  "Name": "Alice",
  "Region": "East",
  "Orders": [
    { "OrderID": 1, "Amount": 250 },
    { "OrderID": 2, "Amount": 400 }
  ]
}
It’s great for flexible, schema-less storage.
But querying JSON in SQL requires special functions, since it’s essentially a text field with structure.
MySQL: The Most JSON-Friendly of the Traditional SQLs
Since MySQL 5.7, JSON has been a native data type, which means it understands JSON structure internally and provides dedicated functions to manipulate it.
Create and Store JSON
CREATE TABLE Customers (
    CustomerID INT,
    CustomerData JSON
);
Insert JSON directly:
INSERT INTO Customers VALUES
(1, '{"Name": "Alice", "Region": "East", "Orders": [100, 200, 300]}');
Parse JSON
Use JSON_EXTRACT() or the shorthand -> operator:
SELECT
    JSON_EXTRACT(CustomerData, '$.Name') AS Name,
    JSON_EXTRACT(CustomerData, '$.Orders[0]') AS FirstOrder
FROM Customers;
Or the shorter syntax:
SELECT
    CustomerData->'$.Name' AS Name,
    CustomerData->'$.Orders[1]' AS SecondOrder
FROM Customers;
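One thing to watch: -> returns a JSON value, so strings come back with quotes. MySQL (5.7.13+) also has the ->> operator, shorthand for JSON_UNQUOTE(JSON_EXTRACT()), when you want plain text. A quick sketch against the same table:
SELECT
    CustomerData->'$.Name'  AS NameJson,   -- "Alice" (still a JSON string)
    CustomerData->>'$.Name' AS NameText    -- Alice (unquoted text)
FROM Customers;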
Convert Multiple Columns to JSON
You can build JSON objects directly from columns:
SELECT
    JSON_OBJECT('CustomerID', CustomerID, 'Region', Region, 'Sales', Sales) AS CustomerJSON
FROM Sales;
Or if you want an array of JSON objects:
SELECT
    CustomerID,
    JSON_ARRAYAGG(JSON_OBJECT('OrderID', OrderID, 'Amount', Amount)) AS Orders
FROM Orders
GROUP BY CustomerID;
Tip: MySQL automatically validates JSON syntax when you use the JSON data type — it won’t accept malformed data.
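Flatten JSON into Rows and Columns
Since MySQL 8.0 you can also go the other way and turn a JSON array into rows with JSON_TABLE(). A minimal sketch against the Customers table above, where Orders is a simple array of amounts:
SELECT
    c.CustomerID,
    o.order_amount
FROM Customers c,
     JSON_TABLE(
         c.CustomerData,
         '$.Orders[*]' COLUMNS (order_amount INT PATH '$')
     ) AS o;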
SQL Server: Powerful JSON Functions (No Native JSON Type)
SQL Server doesn’t have a dedicated JSON type — but it treats JSON as NVARCHAR and provides built-in functions to handle it.
Convert Columns to JSON
Use FOR JSON to create JSON objects from query results.
SELECT CustomerID, Name, Region
FROM Customers
FOR JSON AUTO;
Result:
[{"CustomerID":1,"Name":"Alice","Region":"East"},
{"CustomerID":2,"Name":"Bob","Region":"West"}]
To control structure, use FOR JSON PATH:
SELECT
    CustomerID AS "Customer.CustomerID",
    Name AS "Customer.Name",
    Region AS "Customer.Region"
FROM Customers
FOR JSON PATH;
This gives nested JSON under "Customer".
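With the dotted aliases above, the output looks roughly like this:
[{"Customer":{"CustomerID":1,"Name":"Alice","Region":"East"}},
 {"Customer":{"CustomerID":2,"Name":"Bob","Region":"West"}}]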
Parse JSON
Use OPENJSON() to convert JSON text into rows and columns.
DECLARE @data NVARCHAR(MAX) =
    '{"CustomerID":1, "Orders":[{"OrderID":10,"Amount":200},{"OrderID":11,"Amount":400}]}';

SELECT *
FROM OPENJSON(@data, '$.Orders')
WITH (
    OrderID INT '$.OrderID',
    Amount FLOAT '$.Amount'
);
Result:
| OrderID | Amount |
|---|---|
| 10 | 200 |
| 11 | 400 |
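In real workloads the JSON usually sits in a table column rather than a variable. Assuming a hypothetical CustomerOrders table with an OrdersJson NVARCHAR(MAX) column holding the same payload shape, CROSS APPLY flattens it per customer:
SELECT
    c.CustomerID,
    o.OrderID,
    o.Amount
FROM CustomerOrders AS c            -- hypothetical table/column names
CROSS APPLY OPENJSON(c.OrdersJson, '$.Orders')
WITH (
    OrderID INT '$.OrderID',
    Amount FLOAT '$.Amount'
) AS o;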
JSON Value Extraction
You can also grab specific fields using JSON_VALUE() and JSON_QUERY():
SELECT
    JSON_VALUE(@data, '$.CustomerID') AS CustomerID,
    JSON_QUERY(@data, '$.Orders') AS OrdersArray;
PostgreSQL: json vs jsonb (Fast and Flexible)
PostgreSQL was one of the first relational databases to truly embrace JSON, with two key types:
- json – stored as plain text (slower)
- jsonb – stored in binary format (faster, indexable)
Store JSON
CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    data JSONB
);
Insert JSON data:
INSERT INTO customers (data)
VALUES ('{"name": "Alice", "region": "East", "orders": [100,200,300]}');
Query JSON Fields
Use PostgreSQL’s powerful -> and ->> operators:
SELECT
    data->>'name' AS name,
    data->'orders'->>0 AS first_order
FROM customers;
- -> returns JSON
- ->> returns text
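Because jsonb is stored in binary form, it can also be indexed. A minimal sketch on the customers table above, using a GIN index and the containment operator @>:
-- Index the whole document for containment lookups
CREATE INDEX idx_customers_data ON customers USING GIN (data);

-- Find customers whose document contains this key/value pair
SELECT *
FROM customers
WHERE data @> '{"region": "East"}';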
Convert Columns into JSON
SELECT
    jsonb_build_object('CustomerID', CustomerID, 'Name', Name, 'Region', Region) AS customer_json
FROM customers_info;
Or aggregate JSON objects:
SELECT
    json_agg(json_build_object('id', id, 'name', name, 'region', region)) AS customers
FROM customers_info;
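Flatten JSON Arrays into Rows
To flatten the orders array into one row per order, jsonb_array_elements_text() can be used as a set-returning function in the FROM clause. A sketch against the same customers table:
SELECT
    data->>'name' AS name,
    order_amount::int AS order_amount
FROM customers,
     jsonb_array_elements_text(data->'orders') AS order_amount;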
Hive SQL: JSON in Big Data Ecosystems
Apache Hive and Spark SQL are often used for semi-structured data, especially logs and event payloads stored as JSON.
Hive doesn’t have a dedicated JSON type, but you can use SerDes (Serializers/Deserializers) or built-in functions.
Parse JSON String in Hive
Using get_json_object():
SELECT
    get_json_object(json_col, '$.name') AS name,
    get_json_object(json_col, '$.orders[0].amount') AS first_order
FROM sales_json;
Explode JSON Arrays
Use LATERAL VIEW to flatten JSON arrays. Hive has no built-in function that explodes a JSON array string directly, so a common workaround is to strip the brackets, split the array string into its individual objects, and then explode the result:
SELECT
    a.name,
    get_json_object(b.order_json, '$.order_id') AS order_id,
    get_json_object(b.order_json, '$.amount') AS amount
FROM sales_json
LATERAL VIEW json_tuple(json_col, 'name', 'orders') a AS name, orders
LATERAL VIEW explode(
    split(regexp_replace(regexp_replace(a.orders, '^\\[|\\]$', ''), '\\},\\{', '}||{'), '\\|\\|')
) b AS order_json;
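If you are on Spark SQL rather than plain Hive, from_json() with an explicit schema plus explode() is the cleaner way to do the same thing (the field names below match the lowercase keys assumed above):
SELECT
    get_json_object(json_col, '$.name') AS name,
    o.order_id,
    o.amount
FROM sales_json
LATERAL VIEW explode(
    from_json(get_json_object(json_col, '$.orders'),
              'array<struct<order_id:INT,amount:DOUBLE>>')
) t AS o;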
Convert Columns Into JSON
Use named_struct() together with to_json() (available in newer Hive versions and in Spark SQL):
SELECT
    to_json(named_struct(
        'CustomerID', CustomerID,
        'Region', Region,
        'Sales', Sales
    )) AS customer_json
FROM sales_table;
Other Databases with JSON Support
| Database | JSON Type | Key Functions |
|---|---|---|
| BigQuery | JSON (string or struct) | JSON_EXTRACT(), PARSE_JSON() |
| Snowflake | VARIANT | PARSE_JSON(), TO_VARIANT(), FLATTEN() |
| Oracle | JSON (as CLOB or BLOB) | JSON_VALUE(), JSON_TABLE() |
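Example (BigQuery), a quick sketch assuming a sales_json table (hypothetical project/dataset names) with a STRING column json_col:
SELECT
    JSON_EXTRACT_SCALAR(json_col, '$.Name') AS name,
    JSON_EXTRACT(json_col, '$.Orders') AS orders_json
FROM `my_project.my_dataset.sales_json`;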
Example (Snowflake), assuming a raw_json table (hypothetical name) with a JSON string column column_with_json:
SELECT
    f.value:CustomerID::int AS CustomerID,
    f.value:Name::string AS Name
FROM raw_json,
     LATERAL FLATTEN(input => PARSE_JSON(column_with_json)) f;
Practical Tips for Analysts and BI Developers
- Convert columns to JSON only when needed – it’s slower than relational joins for large datasets.
- Use JSON functions wisely – they’re CPU-intensive; avoid unnecessary parsing.
- Always validate JSON structure before inserting (use ISJSON() in SQL Server or JSON_VALID() in MySQL; see the sketch after this list).
- Leverage BI tools with native JSON parsing (Power BI, Tableau Prep, Looker) for visualization-level flexibility.
- Document JSON schemas for analysts — it makes data models more transparent.
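A quick sketch of the validation tip, borrowing the CustomerData column from earlier (in SQL Server it would be stored as NVARCHAR(MAX)):
-- SQL Server: enforce valid JSON at the table level (constraint name is illustrative)
ALTER TABLE Customers
    ADD CONSTRAINT chk_CustomerData_IsJson CHECK (ISJSON(CustomerData) > 0);

-- MySQL: check a payload before inserting
SELECT JSON_VALID('{"Name": "Alice", "Region": "East"}');   -- returns 1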
Key Takeaways
- MySQL and PostgreSQL have native JSON support and strong functions.
- SQL Server supports JSON via NVARCHAR with great parsing tools (OPENJSON, FOR JSON).
- Hive and Spark SQL handle JSON at scale, but require exploding and deserializing manually.
- Always validate and format your JSON early in the pipeline.
- Use JSON for flexibility, not as a replacement for good schema design.