AAlferez

Reputation: 1492

"Aggregations of aggregations are not allowed" in BigQuery

I'm trying to query my Google Analytics table in BigQuery. The fields I'm interested in are nested. The structure I want to retrieve is: Category > Subcategory > Subsubcategory.

I tried to do the following:

SELECT
  event_param1.value.string_value AS category,
  event_param2.value.string_value AS action,
  ARRAY_AGG(DISTINCT event_param3.value.string_value) AS label
FROM `analytics.events_20*` AS t,
  UNNEST(event_params) AS event_param1,
  UNNEST(event_params) AS event_param2,
  UNNEST(event_params) AS event_param3
WHERE
  PARSE_DATE('%y%m%d', _TABLE_SUFFIX) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  AND event_param1.key = 'category'
  AND event_param2.key = 'action'
  AND event_param3.key = 'label'
GROUP BY category, action
ORDER BY category, action

But this returns one row per category/subcategory pair, each with an array of all its subsubcategories.

I want one row per category that holds all of its subcategories and, for each subcategory, all of its subsubcategories.

This is an example of what I get:

{
    "category": "Apple Watch",
    "action": "Apple Badge Clicked",
    "label": [
      "User Landing Page",
      "Attract",
      "Guest Landing Page",
      "Guest In Workout",
      "User In Workout"
    ]
  },
  {
    "category": "Apple Watch",
    "action": "CONNECTED",
    "label": [
      "User Landing Page",
      "Attract",
      "Guest Landing Page",
      "Guest In Workout",
      "User In Workout"
    ]
  }

And this is what I want:

{
    "category": "Apple Watch",
    "action": {
        "Apple Badge Clicked": {
            "label": [
                "User Landing Page",
                "Attract",
                "Guest Landing Page",
                "Guest In Workout",
                "User In Workout"
            ]
        },
        "CONNECTED": {
            "label": [
                "User Landing Page",
                "Attract",
                "Guest Landing Page",
                "Guest In Workout",
                "User In Workout"
            ]
        }
    }
}

If I try an ARRAY_AGG inside another ARRAY_AGG, I get the error "Aggregations of aggregations are not allowed". I'm aware that what I am asking for is not that simple, but a similar solution would work too.

Upvotes: 0

Views: 3824

Answers (1)

Martin Weitzmann

Reputation: 4746

You need to first aggregate into an array at the highest level (category). After that, you can re-arrange the data inside that array using sub-queries.

This one doesn't reflect your desired output exactly (each action becomes an array element rather than a key), but it is flexible and works with all kinds of action types:

WITH test AS (
  SELECT * FROM UNNEST([
    STRUCT('Apple Watch' AS category, 'Apple Badge Clicked' as action, 'User Landing Page' as label),
    ('Apple Watch','Apple Badge Clicked','Attract'),
    ('Apple Watch','Apple Badge Clicked','Guest Landing Page'),
    ('Apple Watch','CONNECTED','User Landing Page'),
    ('Apple Watch','CONNECTED','Attract'),
    ('Apple Watch','CONNECTED','User In Workout')
  ])  
),
-- first level of aggregation, prepare for fine tuning
catAgg as (
  SELECT 
    category,
    ARRAY_AGG(struct(action, label)) AS catInfo
  FROM test
  GROUP BY 1
)

SELECT 
  category,
  -- feed sub-query output into an array "action"
  array(SELECT AS STRUCT 
     action as actionType, -- re-group data within the array by field "action"
     array_agg(distinct label) as label
   FROM UNNEST(catInfo)
   GROUP BY 1
   ) as action
FROM catAgg
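
The same two-step pattern can be sketched against the table from the question. This is only a sketch: it assumes the `analytics.events_20*` tables and the `category`, `action` and `label` parameter keys exactly as they appear in the question, and the CTE name `flat` is made up for illustration. `SELECT DISTINCT` and `IGNORE NULLS` are additions that are not in the query above; they keep the intermediate array small and avoid NULL labels in the result array.

```sql
WITH flat AS (
  -- One row per distinct (category, action, label) combination,
  -- using the same triple UNNEST and date filter as the question.
  SELECT DISTINCT
    event_param1.value.string_value AS category,
    event_param2.value.string_value AS action,
    event_param3.value.string_value AS label
  FROM `analytics.events_20*` AS t,
    UNNEST(event_params) AS event_param1,
    UNNEST(event_params) AS event_param2,
    UNNEST(event_params) AS event_param3
  WHERE PARSE_DATE('%y%m%d', _TABLE_SUFFIX)
          BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
              AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
    AND event_param1.key = 'category'
    AND event_param2.key = 'action'
    AND event_param3.key = 'label'
),
-- First level of aggregation: one array of (action, label) pairs per category.
catAgg AS (
  SELECT
    category,
    ARRAY_AGG(STRUCT(action, label)) AS catInfo
  FROM flat
  GROUP BY category
)
SELECT
  category,
  -- Second level: re-group the pairs by action inside the array.
  ARRAY(
    SELECT AS STRUCT
      action AS actionType,
      ARRAY_AGG(DISTINCT label IGNORE NULLS) AS label
    FROM UNNEST(catInfo)
    GROUP BY action
  ) AS action
FROM catAgg
ORDER BY category
```

Each result row should then hold one category with an `action` array whose elements carry an `actionType` and its `label` array.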

Hope this helps.

Upvotes: 2
