Reputation: 115
Hey everyone,
I have a big query table that approximately look like this
team_id, channel_id, value_1, value_2,...
A 1111 xxx yyy
B 1111 xxx yyy
C 2222 uuu vvv
D 2222 uuu vvv
E 3333 grr sss
So as you can see, currently, everything is indexed by team_id while all the data that has the same channel_id is same. I need the data to be indexed by channel_id. The result should look like this:
team_id, channel_id, value_1, value_2,...
[A,B] 1111 xxx yyy
[C,D] 2222 uuu vvv
[E] 3333 grr sss
Such that rows with same channel ID are reduced to one row, but the team_id is merged into an array. Is this possible? I am still quite new to SQL/BigQuery and don't know how to do this. I would be happy for some help!
Big thank in advance!
Upvotes: 1
Views: 285
Reputation: 172993
Below is for BigQuery Standard SQL
... all the data that has the same channel_id is same ...
So, below eliminates needs in explicitly listing all those columns so code the same for any number of those columns and their names
#standardSQL
SELECT ARRAY_AGG(team_id) AS team_id, ANY_VALUE(str).*
FROM (
SELECT team_id,
(SELECT AS STRUCT * EXCEPT(team_id) FROM UNNEST([t])) AS str
FROM `project.dataset.table` t
)
GROUP BY TO_JSON_STRING(str)
Depends on your coding preferences - you can use refactored version of above
#standardSQL
SELECT ARRAY_AGG(team_id) AS team_id,
ANY_VALUE((SELECT AS STRUCT * EXCEPT(team_id) FROM UNNEST([t]))).*
FROM `project.dataset.table` t
GROUP BY TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(team_id) FROM UNNEST([t])))
In both cases, if to apply to sample data from your question - the output as below
Row team_id channel_id value_1 value_2
1 A 1111 xxx yyy
B
2 C 2222 uuu vvv
D
3 E 3333 grr sss
Upvotes: 1
Reputation: 1269813
I thin you just want aggregation:
select array_agg(team_id) as team_ids, channel_id, value_1, value_2
from t
group by channel_id, value_1, value_2;
Upvotes: 1