Aravindh Kuppusamy
Aravindh Kuppusamy

Reputation: 1953

In Presto SQL how to create a map of array values and its count

If I have a table like below, how do I get the map of the count of unique values in the arrays in column2?

ID Column1 Column2
1 10 [a, a, b, c]
2 12 [a, a, a]

I would like something like the below:

ID Column1 Column2
1 10 {a: 2, b: 1, c: 1}
2 12 {a: 3}

I tried to use Presto's [histogram][1] for this. But it is an aggregate function that requires group by. I need to use the histogram for each row and not the entire table.

For example,

SELECT distinct ID,
       histogram(column1) AS column1,
       column2
FROM table

returns '"ID"' must be an aggregate expression or appear in GROUP BY clause

Upvotes: 2

Views: 7225

Answers (1)

Guru Stron
Guru Stron

Reputation: 143098

You can use unnest to expand your array into a column and then use histogram over this new column:

WITH dataset AS (
    SELECT *
    FROM (
             VALUES (1, 10, ARRAY['a', 'a', 'b', 'c']),
                    (2, 12, ARRAY['a', 'a', 'a'])
         ) AS t (ID, Column1, Column2))

SELECT
    ID, Column1, histogram(Col2) as Column2
FROM
    dataset
CROSS JOIN unnest(Column2) as t(Col2)
GROUP BY ID, Column1

Result:

ID Column1 Column2
1 10 {a=2, b=1, c=1}
2 12 {a=3}

Upvotes: 5

Related Questions