Reputation: 1062
BigQuery with Legacy SQL has a pretty convenient QUANTILES
function to quickly get a histogram of values in a table without specifying the buckets by hand.
I can't find a nice equivalent in aggregation functions available in Standard SQL. Did I miss something obvious, or otherwise, what's the standard way of emulating it?
Upvotes: 5
Views: 19546
Reputation: 335
If approximate values don't work for you, you can use the PERCENTILE_CONT function.
Upvotes: 0
Reputation: 33745
You're looking for the APPROX_QUANTILES
function :) One of the examples from the docs is:
#standardSQL
SELECT APPROX_QUANTILES(x, 2) AS approx_quantiles
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x;
+------------------+
| approx_quantiles |
+------------------+
| [1, 5, 10] |
+------------------+
Note that it returns an array, but if you want the elements of the array as individual rows, you can unnest the result:
#standardSQL
SELECT
quant, offset
FROM UNNEST((
SELECT APPROX_QUANTILES(x, 2) AS quants
FROM UNNEST([NULL, NULL, 1, 1, 1, 4, 5, 6, 7, 8, 9, 10]) AS x
)) AS quant WITH OFFSET
ORDER BY offset;
Upvotes: 17