Reputation: 328
Is there anyway to flatten the data that is exported to BigQuery? An example of data would be this:
Can I convert the data so that there is just a column for every key with the value below it? So a column firebase_screen_id with the value 8966... for row 1 and a column with board with the value 'M' for row 1 and 'S' for row 2 etc. I have tried using the UNNEST function and it helps me in selecting the values but it still doesn't convert the keys to columns. I've been bashing my head against this all afternoon and finally decided that I'm completely stuck! haha. I would appreciate any tips or help I can get :) Cheers!!
Upvotes: 0
Views: 574
Reputation: 2365
You can create the functions below in BQ, and then use them everywhere you want.
Functions:
CREATE OR REPLACE FUNCTION func.get_param_int
(event_params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, decimal_value FLOAT64>>>, param_key STRING)
AS
((
SELECT COALESCE(
value.int_value,
CAST(value.decimal_value as int64),
SAFE_CAST(value.string_value as int64)
)
FROM UNNEST(event_params) WHERE key = param_key
));
CREATE OR REPLACE FUNCTION func.get_param_str
(event_params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, decimal_value FLOAT64>>>, param_key STRING)
AS
((
SELECT COALESCE(
value.string_value,
CAST(value.decimal_value as string),
CAST(value.int_value as string)
)
FROM UNNEST(event_params) WHERE key = param_key
));
CREATE OR REPLACE FUNCTION func.get_param_double
(event_params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, decimal_value FLOAT64>>>, param_key STRING)
AS
((
SELECT COALESCE(
value.decimal_value,
CAST(value.int_value as float64),
SAFE_CAST(value.string_value as float64)
)
FROM UNNEST(event_params) WHERE key = param_key
));
Usage:
select
user_pseudo_id,
timestamp_micros(event_timestamp) as ts,
event_name,
func.get_param_int(event_params, 'firebase_screen_id'),
func.get_param_str(event_params, 'firebase_screen')
from `project.analytics_xxx.events_yyyymmdd`
Upvotes: 1
Reputation: 833
Considering the data you added as image. I have considered this sample data below
After applying below query to sample data, the output is
--Note - cte is a sample data as per your image
with cte as
(
select struct(
'20180915' as event_date
,['firebase_screen_id','board','value'] as key
,struct (
['1','M','null'] as string_value
) as value
) as event_params
union all
select struct(
'20180916' as event_date
,['firebase_screen_id','board','value'] as key
,struct (
['2','S','null'] as string_value
) as value
) as event_params
)
,cte2 as
(
select event_params.event_date,epkey ,event_params.value.string_value[offset (idx)] as epval
from cte,
unnest(event_params.key) as epkey with offset as idx
)
select *
from cte2
pivot(any_value(epval) for epkey in ('firebase_screen_id','board','value'))
Upvotes: 0