How to use BigQuery SQL syntax to get value as the key in an JSON output

Question

I have a table similar to the following:

Region	Country	Exports
North America	US	Tech
North America	US	Vehicles
North America	US	Food
North America	Canada	Tech
North America	Canada	Food
North America	Canada	Tourism

I am able to get an aggregated output using the following syntax:

select region, struct(country, array_agg(distinct(exports))) from exports_table group by region, country

which looks like:

[
  {
    "f0_": {
      "region": "North America",
      "_field_2": {
        "country": "US",
        "_field_2": [
          "Tech",
          "Vehicles",
          "Food",
          ]
      }
    }
  },
  {
    "f0_": {
      "region": "North America",
      "_field_2": {
        "country": "Canada",
        "_field_2": [
          "Tech",
          "Food",
          "Tourism",
          ]
      }
    }
  },

]

How can I translate this structure 1. to use the fields as keys and 2. better aggregation (as below)? (This is something doable on MySQL using JSON_OBJECTAGG and JSON_ARRAYAGG)

[
    {
        "North America": {
            "US": ["Tech","Vehicles","Food"],
            "Canada": ["Tech","Vehicles","Food"]
        }

    }
]

Mikhail Berlyant · Accepted Answer

Consider below approach

select format('[{"%s": {%s}}]', Region, string_agg(CountryExports)) as output
from (
  select Region, 
    format('"%s": %s', Country, format('%T', array_agg(Exports))) CountryExports
  from exports_table
  group by Region, Country
)
group by Region

if applied to sample data in your question - output is

Should be easy to apply above approach to whatever your real use case is

How to use BigQuery SQL syntax to get value as the key in an JSON output

Answers (1)

Related Questions