CaitlynCodr
CaitlynCodr

Reputation: 328

How can I actually flatten the firebase data in BigQuery so that every parameter key is a column?

Is there anyway to flatten the data that is exported to BigQuery? An example of data would be this:

Text

Can I convert the data so that there is just a column for every key with the value below it? So a column firebase_screen_id with the value 8966... for row 1 and a column with board with the value 'M' for row 1 and 'S' for row 2 etc. I have tried using the UNNEST function and it helps me in selecting the values but it still doesn't convert the keys to columns. I've been bashing my head against this all afternoon and finally decided that I'm completely stuck! haha. I would appreciate any tips or help I can get :) Cheers!!

Upvotes: 0

Views: 574

Answers (2)

Sabri Karagönen
Sabri Karagönen

Reputation: 2365

You can create the functions below in BQ, and then use them everywhere you want.

Functions:

CREATE OR REPLACE FUNCTION func.get_param_int
  (event_params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, decimal_value FLOAT64>>>, param_key STRING) 
AS 
((
  SELECT COALESCE(
    value.int_value,
    CAST(value.decimal_value as int64),
    SAFE_CAST(value.string_value as int64)
  )
  FROM UNNEST(event_params) WHERE key = param_key
));

CREATE OR REPLACE FUNCTION func.get_param_str
  (event_params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, decimal_value FLOAT64>>>, param_key STRING) 
AS 
((
  SELECT COALESCE(
    value.string_value,
    CAST(value.decimal_value as string),
    CAST(value.int_value as string)
  )
  FROM UNNEST(event_params) WHERE key = param_key
));

CREATE OR REPLACE FUNCTION func.get_param_double
  (event_params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, decimal_value FLOAT64>>>, param_key STRING) 
AS 
((
  SELECT COALESCE(
    value.decimal_value,
    CAST(value.int_value as float64),
    SAFE_CAST(value.string_value as float64)
  )
  FROM UNNEST(event_params) WHERE key = param_key
));

Usage:

select 
  user_pseudo_id, 
  timestamp_micros(event_timestamp) as ts, 
  event_name, 
  func.get_param_int(event_params, 'firebase_screen_id'), 
  func.get_param_str(event_params, 'firebase_screen')
from `project.analytics_xxx.events_yyyymmdd`

Upvotes: 1

Mr.Batra
Mr.Batra

Reputation: 833

Considering the data you added as image. I have considered this sample data below enter image description here

After applying below query to sample data, the output is enter image description here

--Note - cte is a sample data as per your image
with cte as
(
    select struct(
        '20180915' as event_date
        ,['firebase_screen_id','board','value'] as key
        ,struct (
            ['1','M','null'] as string_value
        ) as value
    ) as event_params

    union all 

    select struct(
        '20180916' as event_date
        ,['firebase_screen_id','board','value'] as key
        ,struct (
            ['2','S','null'] as string_value
        ) as value
    ) as event_params
)
,cte2 as
(
    select event_params.event_date,epkey ,event_params.value.string_value[offset (idx)] as epval
    from cte,
    unnest(event_params.key) as epkey with offset as idx
)
select  * 
from cte2
pivot(any_value(epval) for epkey in ('firebase_screen_id','board','value'))

Upvotes: 0

Related Questions