user304611
user304611

Reputation: 337

Array structures querying in presto, hive

col-1 has dep_id(varchar) -

112 

col-2 has array struct

 [
  {
    "emp_id": 8291828,
    "name": "bruce",
  },
  {
    "emp_id": 8291823,
    "name": "Rolli",
  }
]

I have a use case where i need to flatten and display results. For example when queried data for dep_id - 112 I need to display emp_id in a separate row. For above data when queried my result should look like

id  emp_id
112 8291828
112 8291823

What should be my query format to fetch data?

Upvotes: 0

Views: 7838

Answers (1)

Dain Sundstrom
Dain Sundstrom

Reputation: 2858

There are several parts to make this work. First the JSON data will appear as a VARCHAR, so you first need to run json_parse on it to convert it to a JSON type in the engine. Then you can cast JSON types to normal SQL structural types, and in your case this is an array of rows (see cast from JSON). Finally, you do a cross join to the array of rows (which is effectively a nested table). This query fill give you the results you want

WITH your_table AS (
    SELECT
        112 AS dep_id
        , '[{"emp_id": 8291828, "name": "bruce"}, {"emp_id": 8291823, "name": "Rolli"}]' AS data
)
SELECT
    dep_id
    , r.emp_id
    , r.name
FROM your_table
CROSS JOIN
    UNNEST(cast(json_parse(data) as array(row (emp_id bigint, name varchar)))) nested_data(r)    

Upvotes: 1

Related Questions