Reputation: 46
I have a UDF which returns a floating point array of the same size for each row of a table. How do I sum values of these arrays ?
In other words, how can I do something like this:
create temp function f(...)
returns array<float64>
...;
select sum(f(column)) from table
As the result of this operation I need to get another array of equal size where
result[i] = sum(over rows) f(row, column)[i]
Upvotes: 0
Views: 2608
Reputation: 6282
This does not directly answer OP's question, but people landing on this page searching for "How do I add arrays in BigQuery SQL?" might benefit.
(Based on @elliott-brossard answer edit) In case you have 2 arrays, but 1 array includes a struct, you can use the following code to add them together:
WITH mydata AS (
SELECT
[1, 2, 3] AS arr
-- ,[7, 8, 9] AS arr2
,[
STRUCT(7 AS timeOnSite)
,STRUCT(8 AS timeOnSite)
,STRUCT(9 AS timeOnSite)
] AS arr2
)
SELECT
(
SELECT
ARRAY_AGG(sum ORDER BY off) AS arr
FROM (
SELECT
off,
SUM(x) AS sum
FROM (
SELECT arr UNION ALL
-- SELECT arr2
SELECT (SELECT ARRAY_AGG(t.timeOnSite) FROM UNNEST(arr2) AS t)
), UNNEST(arr) AS x WITH OFFSET off
GROUP BY off
)
) AS sum_arrays
FROM
mydata
Upvotes: 0
Reputation: 7277
If you have your UDF defined (takes in a your column(s) and returns a float64
array of a pre-determined (or fixed) dimensions), you can use a simplified solution. For example in case of 3-d arrays, something like:
create temp function f(...)
returns array<float64>
...;
with dataset as (
select arr[offset(0)] as col_a, arr[offset(1)] as col_b, arr[offset(2)] as col_c
from (
select f(mycolumn) as arr
from `mydataset.mytable`
)
)
select [sum(col_a), sum(col_b), sum(col_c)] as new_array from dataset
Upvotes: 0
Reputation: 33705
Here is a function that uses ANY TYPE
in order to support summing arrays of FLOAT64
, INT64
, or NUMERIC
along with some sample input:
CREATE TEMP FUNCTION ElementWiseSum(arr1 ANY TYPE, arr2 ANY TYPE) AS (
ARRAY(SELECT x + arr2[OFFSET(off)] FROM UNNEST(arr1) AS x WITH OFFSET off ORDER BY off)
);
SELECT arr1, arr2, ElementWiseSum(arr1, arr2) AS result
FROM (
SELECT [1, 2, 3] AS arr1, [4, 5, 6] AS arr2 UNION ALL
SELECT [7, 8], [9, 10] UNION ALL
SELECT [], [] UNION ALL
SELECT [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]
);
It unnests arr1
using WITH OFFSET
, then retrieves the equivalent element from arr2
using this offset, and orders by the offset to ensure that the element order is preserved.
Edit: to sum across rows, you can unnest the arrays, compute sums grouped by the offset of the elements, then reaggregate the sums into a new array:
SELECT
ARRAY_AGG(sum ORDER BY off) AS arr
FROM (
SELECT
off,
SUM(x) AS sum
FROM (
SELECT [1, 2, 3] AS arr UNION ALL
SELECT [7, 8, 9] UNION ALL
SELECT [4, 5, 6] UNION ALL
SELECT [10, 11, 12]
), UNNEST(arr) AS x WITH OFFSET off
GROUP BY off
);
Upvotes: 2
Reputation: 7947
So based on your comment, what you are looking for is the sum
the values of all your arrays. This is how you can do it using UNNEST
operator
WITH mydata AS (
SELECT [1.4, 1.3, 1.4, 1.1] as myarray
union all
SELECT [1.4, 1.3, 1.4, 1.1] as myarray
union all
SELECT [1.4, 1.3, 1.4, 1.1] as myarray
)
SELECT SUM(eachelement) from mydata, UNNEST(myarray) AS eachelement;
Upvotes: 1