ishan3243
ishan3243

Reputation: 1928

Aggregate arrays element-wise in presto/athena

I have a table which has an array column. The size of the array is guaranteed to be same in all rows. Is it possible to do an element-wise aggregation on the arrays to create a new array?

For e.g. if my aggregation is the avg function then:

Array 1: [1,3,4,5]
Array 2: [3,5,6,1]

Output:  [2,4,5,3]

I would want to write queries like these:

select
  timestamp_column,
  avg(array_column) as new_array
from 
  my_table
group by 
  timestamp_column

The array contains close to 200 elements, so I would prefer not to hardcode each element in the query :)

Upvotes: 4

Views: 6050

Answers (1)

Dain Sundstrom
Dain Sundstrom

Reputation: 2858

This can be done by combining 2 lesser known SQL constructs: UNNEST WITH ORDINALITY, and array_agg with ORDER BY.

The first step is to unpack the arrays into rows usingCROSS JOIN UNNEST(a) WITH ORDINALITY. For each element in each array, it will output a row containing the element value and the position of that element in the array.

Then you use a standardard GROUP BY on the ordinal, and sum the values.

Finally, you reassemble the sums back into an array using array_agg(value_sum ORDER BY ordinal). The critical part of this expression is the ORDER BY clause in the array_agg call. Without this the values would be an an arbitrary order.

Here is a full example:

WITH t(a) AS (VALUES array [1, 3, 4, 5], array [3, 5, 6, 1])
SELECT array_agg(value_sum ORDER BY ordinal)
FROM (
    SELECT ordinal, sum(value) AS value_sum
    from t
             CROSS JOIN UNNEST(t.a) WITH ORDINALITY AS x(value, ordinal)
    GROUP BY ordinal);

Upvotes: 7

Related Questions