Reputation: 1953
If I have a table like below, how do I get the map of the count of unique values in the arrays in column2?
ID | Column1 | Column2 |
---|---|---|
1 | 10 | [a, a, b, c] |
2 | 12 | [a, a, a] |
I would like something like the below:
ID | Column1 | Column2 |
---|---|---|
1 | 10 | {a: 2, b: 1, c: 1} |
2 | 12 | {a: 3} |
I tried to use Presto's [histogram][1]
for this. But it is an aggregate function that requires group by
. I need to use the histogram
for each row and not the entire table.
For example,
SELECT distinct ID,
histogram(column1) AS column1,
column2
FROM table
returns
'"ID"' must be an aggregate expression or appear in GROUP BY clause
Upvotes: 2
Views: 7225
Reputation: 143098
You can use unnest
to expand your array into a column and then use histogram
over this new column:
WITH dataset AS (
SELECT *
FROM (
VALUES (1, 10, ARRAY['a', 'a', 'b', 'c']),
(2, 12, ARRAY['a', 'a', 'a'])
) AS t (ID, Column1, Column2))
SELECT
ID, Column1, histogram(Col2) as Column2
FROM
dataset
CROSS JOIN unnest(Column2) as t(Col2)
GROUP BY ID, Column1
Result:
ID | Column1 | Column2 |
---|---|---|
1 | 10 | {a=2, b=1, c=1} |
2 | 12 | {a=3} |
Upvotes: 5