Koen
Koen

Reputation: 15

How to create an array of structs containing aggregated data in BigQuery?

I have a table with event data from Firebase in Google BigQuery with the following columns:

event_date, event_timestamp, event_name, (etc..).

To count the number of events by event_name I use the following query:

SELECT event_name, count(event_name) as event_count
FROM `analytics.events_20200510`
GROUP BY event_name;

which results in:

Query result

I want to convert this result to an array of structs, formatted as JSON. Example:

{
    "events": [
        {"first_open": 69},
        {"screen_view": 510},
        {"user_engagement": 354},
        {"...": ..}
    ]
}

I started with

SELECT TO_JSON_STRING(
STRUCT(
  ??? as events
), true)

but I'm stuck and don't know what should go on the ??? to make it work. Any help would be highly appreciated.


Edit:

The answer below:

SELECT TO_JSON_STRING(
STRUCT(
  (SELECT ARRAY (SELECT AS STRUCT event_name, count(*) AS count FROM `analytics.events_20200510` GROUP BY event_name)) AS events
), true);

results in

{
  "events": [
    {
      "event_name": "session_start",
      "count": 69
    },
    {
      "event_name": "screen_view",
      "count": 510
    },
    (...)
  ]
}

which is almost correct, but I would like the event names to be the 'key' and the count to be the 'value' in the struct (as in my JSON example). Would this be possible?

Upvotes: 0

Views: 574

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172954

Below is for BigQuery Standard SQL

#standardSQL
SELECT '{"events": [' || STRING_AGG(events) || ']}' AS json
FROM (
  SELECT FORMAT('{"%s": %i}', event_name, COUNT(event_name)) AS events
  FROM `analytics.events_20200510`
  GROUP BY event_name
)

if you would run it against same data as in your question - result will be

Row json     
1   {"events": [{"session_start": 69},{"screen_view": 510},{"user_engahement": 354}]}   

Upvotes: 1

hlagos
hlagos

Reputation: 7947

This should work

with data as (
  select 'event_type1' as event_type
  union all
  select 'event_type2' as event_type
  union all
  select 'event_type3' as event_type
  union all
  select 'event_type1' as event_type
  union all
  select 'event_type2' as event_type
  union all
  select 'event_type1' as event_type
) 
select TO_JSON_STRING(t) from (
SELECT  array (
select as struct count(*) as event_count, event_type from data group by event_type
) as events
) t;

Output

{"events":[{"event_count":3,"event_type":"event_type1"},{"event_count":2,"event_type":"event_type2"},{"event_count":1,"event_type":"event_type3"}]}

Upvotes: 0

Related Questions