Reputation: 733
I want to extract some values for particular keys from a table with json string as below.
raw_data | ... |
---|---|
{"label": "XXX", "lines":[{"amount":1000, "category": "A"}, {"amount":100, "category": "B"}, {"amount":10, "category": "C"}]} | ... |
I am expecting an outcome like
label | amount | category |
---|---|---|
XXX | [1000, 100, 10] | ['A', 'B', 'C'] |
I am using the following sql query to achieve that
select
JSON_EXTRACT(raw_data, '$.lines[*].amount') AS amount,
JSON_EXTRACT(raw_data, '$.lines[*].category') AS category,
JSON_EXTRACT(raw_data, '$.label') AS label
from table
I can get a specific element of the list with [0]
, [1]
etc. But the sql code doesn't work with [*]
. I am getting the following error
Invalid JSON path: '$.lines[*].amount'
Edit
I am using Presto
Upvotes: 1
Views: 1372
Reputation: 142173
Json path support in Presto is very limited, so you need to do some processing manually for example with casts and array functions:
-- sample data
with dataset (raw_data) as (
values '{"label": "XXX", "lines":[{"amount":1000, "category": "A"}, {"amount":100, "category": "B"}, {"amount":10, "category": "C"}]}'
)
-- query
select label,
transform(lines, l -> l['amount']) amount,
transform(lines, l -> l['category']) category
from (
select JSON_EXTRACT(raw_data, '$.label') AS label,
cast(JSON_EXTRACT(raw_data, '$.lines') as array(map(varchar, json))) lines
from dataset
);
Output:
label | amount | category |
---|---|---|
XXX | [1000, 100, 10] | ["A", "B", "C"] |
In Trino json path support was vastly improved, so you can do next:
-- query
select JSON_EXTRACT(raw_data, '$.label') label,
JSON_QUERY(raw_data, 'lax $.lines[*].amount' WITH ARRAY WRAPPER) amount,
JSON_QUERY(raw_data, 'lax $.lines[*].category' WITH ARRAY WRAPPER) category
from dataset;
Upvotes: 1
Reputation: 5124
I was able to get the expected output using unnest
to flatten and array_agg
to aggregate in Presto
. Below is the SQL used and output generated:
WITH dataset AS (
SELECT
* from sf_73535794
)
SELECT raw_data.label,array_agg(t.lines.amount) as amount,array_agg(t.lines.category) as category FROM dataset
CROSS JOIN UNNEST(raw_data.lines) as t(lines) group by 1
Output:
Upvotes: 0
Reputation: 71451
You can use json_table
and json_arrayagg
:
select json_extract(t.raw_data, '$.label'),
(select json_arrayagg(t1.v) from json_table(t.raw_data, '$.lines[*]' columns (v int path '$.amount')) t1),
(select json_arrayagg(t1.v) from json_table(t.raw_data, '$.lines[*]' columns (v text path '$.category')) t1)
from tbl t
Upvotes: 0