blah
blah

Reputation: 85

Partial Rolled Up Aggregations in Bigquery

Is there a way to do a rolled up aggregation on only certain GROUP BY columns instead of all GROUP BY columns? Instead of SELECT a,b,SUM(c) FROM sample.data GROUP BY a,b or SELECT a,b,SUM(c) FROM sample.data GROUP BY ROLLUP(a,b), I am trying to achieve SELECT a,b,SUM(c) FROM sample.data GROUP BY a, ROLLUP(b). Is this doable in BigQuery?

The only way I can imagine doing it right now is by creating a GROUPING column for each column I do not wish to be rolled up and filtering those columns to only 0. However, when trying this I get the error Errors encountered during job execution. Resources exceeded during query execution. that I do not get when not including a ROLLUP(). (Same error even with GROUP EACH BY.)

Thoughts?

Upvotes: 6

Views: 8361

Answers (3)

JosMac
JosMac

Reputation: 2332

RollUp is on Bigquery available in Legacy SQL (https://cloud.google.com/bigquery/docs/reference/legacy-sql#groupby) which is not exactly great if you also need to use some features from Standard SQL.

Upvotes: 1

selitsky
selitsky

Reputation: 44

MS SQL support groups like these:

GROUP BY a, ROLLUP(b, c)
GROUP BY ROLLUP(a, b), ROLLUP(c, d)

See article https://technet.microsoft.com/en-us/library/bb522495(v=sql.105).aspx

Google BQ still doesn't support.

You can use HAVING clause, like this:

SELECT a,b,SUM(c) FROM sample.data GROUP BY ROLLUP(a,b) 
HAVING a is not null

OR

SELECT a,b,SUM(c) FROM sample.data GROUP BY ROLLUP(a,b) 
HAVING GROUPING(a)=0

This solution doesn't solve problem with Error: Resources exceeded during query execution.

I suppose, that full support of rollup, cube and grouping sets is a good feature for future BQ releases.

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173190

I dont think sysntax like GROUP BY a, ROLLUP(b) is available! It is not!
Just as a workaround for your case I would propose below mimicing of such syntax

So, assuming you have table sample.data with a, b, c, d and you need to mimic

SELECT a, b, c, SUM(d) as s 
FROM table 
GROUP BY a, b, ROLLUP(c)  

Below is ROLLUP-less "version" of it :

SELECT a, b, c, s 
FROM (
  SELECT a, b, c, SUM(d) AS s 
  FROM sample.data
  GROUP BY a, b, c
  ), (
  SELECT a, b, NULL AS c, SUM(s) AS s
  FROM (
    SELECT a, b, c, SUM(d) AS s 
    FROM sample.data
    GROUP BY a, b, c
  ) 
  GROUP BY 1, 2, 3
)
ORDER BY a, b, c

Upvotes: 2

Related Questions