PatientOutcomes
PatientOutcomes

Reputation: 9

Standard SQL - multiple rows from an ARRAY of STRUCT

I have a JSON structure similar to the following:

{"Name": "Value 1",
 "Colors": {
   {"Basic colors": {
      "Primary colors": {
         [{"First Color":"Blue", "Second Color": "Red"}]
      }
    }
  }
}

Using JSON_EXTRACT(Name.Colors, '$.Basic_colors[0].Primary_colors)I can return the array of a struct, and then extract the values from the struct. However, there can be multiple items in "Primary Color" such as:

[{"First Color":"Blue", "Second Color": "Red"},{"First Color":"Green", "Second Color": "Orange"}]

Ideal solution:

Name    |   First Color  |  Second Color  

Value 1 |   Blue         |  Red

Value 1 |   Green        |  Orange

The problem I've had is that, when using JSON_EXTRACT, "Primary Colors" isn't recognized as an array or struct. It is seen as a string (Makes sense, but also isn't able to be CAST() - so can't UNNEST().

The second problem I've run into is that, while I can index to [0] element in array while using JSON_extract, I can't loop for each element generating a new row of data with all child elements.

I can't use FNSPLIT, as it isn't supported in standard SQL. I believe I'd like to UNNEST(), but I can't figure out how to transform the STRUCT to be recognized as the first element in the array. My python-saturated brain wants to loop through: for item in range(len(json_array)):

I can't use Python for this (in production) and there must be an easier way than writing a loop in a SQL macro?

Upvotes: 1

Views: 1097

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172954

May 1st, 2020 Update

A new function, JSON_EXTRACT_ARRAY, has been just added to the list of JSON functions. This function allows you to extract the contents of a JSON document as a string array.

so in below you can replace use of json2array UDF with just in-built function JSON_EXTRACT_ARRAY as in below example

#standardSQL
SELECT 
  JSON_EXTRACT_SCALAR(json,'$.Name') Name, 
  JSON_EXTRACT_SCALAR(item, '$.First_Color') First_Color,
  JSON_EXTRACT_SCALAR(item, '$.Second_Color') Second_Color
FROM `project.dataset.table`, 
UNNEST(JSON_EXTRACT_ARRAY(json,'$.Colors.Basic_colors.Primary_colors')) item

==============

Below example for BigQuery Standard SQL

#standardSQL
CREATE TEMP FUNCTION JsonToArray(input STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(input).map(x=>JSON.stringify(x));
"""; 
WITH `project.dataset.table` AS (
  SELECT '''{
  "Name": "Value 1",
  "Colors": {
    "Basic_colors": {"Primary_colors": [
      {"First_Color":"Blue", "Second_Color": "Red"},
      {"First_Color":"Green", "Second_Color": "Orange"}
     ]
   }
  }}''' json
)
SELECT 
  JSON_EXTRACT_SCALAR(json,'$.Name') Name, 
  JSON_EXTRACT_SCALAR(item, '$.First_Color') First_Color,
  JSON_EXTRACT_SCALAR(item, '$.Second_Color') Second_Color
FROM `project.dataset.table`, 
UNNEST(JsonToArray(JSON_EXTRACT(json,'$.Colors.Basic_colors.Primary_colors'))) item

with output

Row Name    First_Color Second_Color     
1   Value 1 Blue        Red  
2   Value 1 Green       Orange

Upvotes: 3

Related Questions