markzz
markzz

Reputation: 1262

How do you aggregate two columns into arrays in BigQuery?

Say I have a table with 4 columns, a of type string, b of type integer, c of type integer, and d of type integer. How would I go ahead and use something like ARRAY_AGG on a STRUCT(b, c) and d separately (in other words, have two separate columns that would be arrays)?

The Query I have so far:

SELECT table1.a, table1.x, table2.y
FROM (
  SELECT a, ARRAY_AGG(STRUCT(b, c)) AS x
  FROM project.table
  GROUP BY a
  ORDER BY a
) AS table1
LEFT OUTER JOIN (
  SELECT a, ARRAY_AGG(d) AS y
  FROM project.table
  GROUP BY a, d
  ORDER BY a, d
) table2 ON table1.a = table2.a
GROUP BY table1.a
ORDER BY table1.a;

This gives me the error: SELECT list expression references table1.x which is neither grouped nor aggregated at [1:20]

But if I try to add table1.x to the GROUP BY clause at the end, I get a new error: Grouping by expressions of type ARRAY is not allowed at [14:22]

Upvotes: 0

Views: 3965

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173190

Below is for BigQuery Standard SQL

#standardSQL
SELECT a, 
  ARRAY(
    SELECT AS STRUCT b, c FROM t.x GROUP BY b, c
  ) AS x, y
FROM (
  SELECT a,  
    ARRAY_AGG(STRUCT(b, c)) AS x, 
    ARRAY_AGG(DISTINCT d) AS y
  FROM `project.dataset.table`
  GROUP BY a
) t

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1271003

Why not just do this in one query?

SELECT a, ARRAY_AGG(STRUCT(b, c)) AS x, ARRAY_AGG(DISTINCT d) AS y
FROM project.table
GROUP BY a
ORDER BY a;

Your version will work without the outer GROUP BY. But it is needlessly compicated.

Upvotes: 0

Related Questions