Fabich

Reputation: 3059

How to aggregate arrays element by element in BigQuery?

In BigQuery, how can I aggregate arrays element by element?

For instance, if I have this table:

id array_value
1 [1, 2, 3]
2 [4, 5, 6]
3 [7, 8, 9]

I want to sum all the vectors element-wise and output [1+4+7, 2+5+8, 3+6+9] = [12, 15, 18].

I can SUM float fields with SELECT SUM(float_field) FROM table, but when I try to apply SUM to an array I get:

No matching signature for aggregate function SUM for argument types: ARRAY.
Supported signatures: SUM(INT64); SUM(FLOAT64); SUM(NUMERIC); SUM(BIGNUMERIC) at [1:8]

I have found ARRAY_AGG in the doc but it is not what I want: it just creates an array from values.

Upvotes: 0

Views: 3073

Answers (4)

Mikhail Berlyant

Reputation: 172974

Below is for BigQuery Standard SQL

select array_agg(val order by offset) 
from (
  select offset, sum(val) as val 
  from `project.dataset.table` t, 
  unnest(array_value) as val with offset 
  group by offset
)    
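
For anyone who wants to try this without a table, here is a minimal self-contained sketch of the same unnest-with-offset approach, using the question's three rows inlined as a CTE. The names sample, pos, total and summed are placeholders I made up for illustration; they are not from the original post.

```sql
-- Sample rows from the question, inlined (the CTE name `sample` is hypothetical)
with sample as (
  select 1 as id, [1, 2, 3] as array_value union all
  select 2, [4, 5, 6] union all
  select 3, [7, 8, 9]
)
select array_agg(total order by pos) as summed
from (
  -- one output row per array position, summed across all input rows
  select pos, sum(val) as total
  from sample t,
  unnest(t.array_value) as val with offset as pos
  group by pos
)
-- for the rows above this should give [12, 15, 18]
```

Grouping by position means arrays of different lengths are handled too: a position only sums the rows that actually have an element there.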

Upvotes: 2

Martin Weitzmann

Reputation: 4736

I think technically you can simply refer to the individual values in the arrays using offset(), or safe_offset() in case there might be missing values:

-- example data
with temp as (
  select * from unnest([
    struct(1 as id, [1, 2, 3] as array_value),
    (2, [4,5,6]),
    (3, [7,8])
  ])
)

-- actual query
select
  [
    SUM( array_value[safe_offset(0)] ),
    SUM( array_value[safe_offset(1)] ),
    SUM( array_value[safe_offset(2)] )
  ] as result_array
from temp

I put them in a result array, but you don't have to do that. I left one value out of the last array to show that the query doesn't break. If you want it to break on a missing value, use offset() without the 'safe_' prefix.
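
To make the difference concrete, here is a small sketch using only an array literal (no table assumed). The column alias safe_value is just a placeholder name.

```sql
-- safe_offset returns NULL when the position does not exist
select [7, 8][safe_offset(2)] as safe_value

-- the same lookup with offset() raises an out-of-bounds error instead:
-- select [7, 8][offset(2)]
```

Since SUM ignores NULLs, the short array simply contributes nothing to the third position, so with the example data above the result should be [12, 15, 9].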

Upvotes: 1

GMB

Reputation: 222432

I think you want:

select array_agg(sum_val order by idx) as res
from (
    select idx, sum(val) as sum_val
    from mytable t
    cross join unnest(t.array_value) as val with offset as idx
    group by idx
) t

Upvotes: 2

Gordon Linoff

Reputation: 1269543

I think you want:

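-- Note: the inner sum runs over each row's own array, so this returns one
-- total per row (6, 15 and 24 for the question's data), not per-position sums.
select array_agg(sum_val)
from (select (select sum(val)
              from unnest(t.array_value) val
             ) as sum_val
      from t
     ) x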

Upvotes: 1
