Reputation: 3059
In BigQuery, how can I aggregate arrays element by element?
For instance, if I have this table:
id | array_value |
---|---|
1 | [1, 2, 3] |
2 | [4, 5, 6] |
3 | [7, 8, 9] |
I want to sum all the vectors element-wise and output [1+4+7, 2+5+8, 3+6+9] = [12, 15, 18].
I can sum float fields with SELECT SUM(float_field) FROM table,
but when I try to apply SUM to an array I get:
No matching signature for aggregate function SUM for argument types: ARRAY. Supported signatures: SUM(INT64); SUM(FLOAT64); SUM(NUMERIC); SUM(BIGNUMERIC) at [1:8]
I found ARRAY_AGG in the documentation, but it is not what I want: it just creates an array from values.
Upvotes: 0
Views: 3073
Reputation: 172974
Below is for BigQuery Standard SQL
select array_agg(val order by offset)
from (
  select offset, sum(val) as val
  from `project.dataset.table` t,
  unnest(array_value) as val with offset
  group by offset
)
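To try this without an existing table, the question's three rows can be inlined in a CTE. This is only a sketch: the CTE name `sample_data` and the alias `pos` are made up for illustration and are not part of the original answer.

```sql
-- Hypothetical inline copy of the question's table
with sample_data as (
  select 1 as id, [1, 2, 3] as array_value union all
  select 2, [4, 5, 6] union all
  select 3, [7, 8, 9]
)
select array_agg(val order by pos) as summed
from (
  -- one row per array position, holding the sum across all ids
  select pos, sum(val) as val
  from sample_data,
  unnest(array_value) as val with offset as pos
  group by pos
)
-- summed: [12, 15, 18]
```

Because the grouping is on the element position rather than on fixed indexes, the same query should also cope with arrays of different lengths: a position that is missing from a shorter array simply contributes no row to that group.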
Upvotes: 2
Reputation: 4736
I think technically you can simply refer to the individual values in the arrays using offset(), or safe_offset() in case there might be missing values:
-- example data
with temp as (
  select * from unnest([
    struct(1 as id, [1, 2, 3] as array_value),
    (2, [4, 5, 6]),
    (3, [7, 8])
  ])
)
-- actual query
select
  [
    SUM( array_value[safe_offset(0)] ),
    SUM( array_value[safe_offset(1)] ),
    SUM( array_value[safe_offset(2)] )
  ] as result_array
from temp
I put them in a result array, but you don't have to do that. I left the last array one value short to show that the query doesn't break. If you want it to break, use offset() without the 'safe_' prefix.
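A minimal sketch of the difference between the two accessors, using a literal two-element array rather than any real table:

```sql
-- safe_offset returns NULL when the index is past the end of the array
select [7, 8][safe_offset(2)] as third_value;
-- third_value: NULL

-- offset raises an out-of-bounds error for the same index,
-- so this statement is left commented out:
-- select [7, 8][offset(2)] as third_value;
```

Since SUM skips NULL inputs, the short array just contributes nothing to the third position instead of failing the query.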
Upvotes: 1
Reputation: 222432
I think you want:
select array_agg(sum_val order by idx) as res
from (
  select idx, sum(val) as sum_val
  from mytable t
  cross join unnest(t.array_value) as val with offset as idx
  group by idx
) t
Upvotes: 2
Reputation: 1269543
I think you want:
select array_agg(sum_val)
from (select (select sum(val)
              from unnest(t.array_value) val
             ) as sum_val
      from t
     ) x
Upvotes: 1