igorvm
igorvm

Reputation: 46

How do I add arrays in BigQuery SQL?

I have a UDF which returns a floating point array of the same size for each row of a table. How do I sum values of these arrays ?

In other words, how can I do something like this:

create temp function f(...)
returns array<float64>
...;
select sum(f(column)) from table

As the result of this operation I need to get another array of equal size where

result[i] = sum(over rows) f(row, column)[i]

Upvotes: 0

Views: 2608

Answers (4)

NumesSanguis
NumesSanguis

Reputation: 6282

This does not directly answer OP's question, but people landing on this page searching for "How do I add arrays in BigQuery SQL?" might benefit.

(Based on @elliott-brossard answer edit) In case you have 2 arrays, but 1 array includes a struct, you can use the following code to add them together:

WITH mydata AS (
  SELECT
    [1, 2, 3] AS arr
    -- ,[7, 8, 9] AS arr2
    ,[
      STRUCT(7 AS timeOnSite)
      ,STRUCT(8 AS timeOnSite)
      ,STRUCT(9 AS timeOnSite)
    ] AS arr2
)

SELECT
  (
    SELECT
      ARRAY_AGG(sum ORDER BY off) AS arr
    FROM (
      SELECT
        off,
        SUM(x) AS sum
      FROM (
        SELECT arr UNION ALL
        -- SELECT arr2
        SELECT (SELECT ARRAY_AGG(t.timeOnSite) FROM UNNEST(arr2) AS t)
      ), UNNEST(arr) AS x WITH OFFSET off
      GROUP BY off
    )  
  ) AS sum_arrays
FROM 
  mydata

Upvotes: 0

saifuddin778
saifuddin778

Reputation: 7277

If you have your UDF defined (takes in a your column(s) and returns a float64 array of a pre-determined (or fixed) dimensions), you can use a simplified solution. For example in case of 3-d arrays, something like:

create temp function f(...)
returns array<float64>
...;

with dataset as (
  select arr[offset(0)] as col_a, arr[offset(1)] as col_b, arr[offset(2)] as col_c
    from (
       select f(mycolumn) as arr
       from `mydataset.mytable`
    )
)

select [sum(col_a), sum(col_b), sum(col_c)] as new_array from dataset

Upvotes: 0

Elliott Brossard
Elliott Brossard

Reputation: 33705

Here is a function that uses ANY TYPE in order to support summing arrays of FLOAT64, INT64, or NUMERIC along with some sample input:

CREATE TEMP FUNCTION ElementWiseSum(arr1 ANY TYPE, arr2 ANY TYPE) AS (
  ARRAY(SELECT x + arr2[OFFSET(off)] FROM UNNEST(arr1) AS x WITH OFFSET off ORDER BY off)
);

SELECT arr1, arr2, ElementWiseSum(arr1, arr2) AS result
FROM (
  SELECT [1, 2, 3] AS arr1, [4, 5, 6] AS arr2 UNION ALL
  SELECT [7, 8], [9, 10] UNION ALL
  SELECT [], [] UNION ALL
  SELECT [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]
);

It unnests arr1 using WITH OFFSET, then retrieves the equivalent element from arr2 using this offset, and orders by the offset to ensure that the element order is preserved.

Edit: to sum across rows, you can unnest the arrays, compute sums grouped by the offset of the elements, then reaggregate the sums into a new array:

SELECT
  ARRAY_AGG(sum ORDER BY off) AS arr
FROM (
  SELECT
    off,
    SUM(x) AS sum
  FROM (
    SELECT [1, 2, 3] AS arr UNION ALL
    SELECT [7, 8, 9] UNION ALL
    SELECT [4, 5, 6] UNION ALL
    SELECT [10, 11, 12]
  ), UNNEST(arr) AS x WITH OFFSET off
  GROUP BY off
);

Upvotes: 2

hlagos
hlagos

Reputation: 7947

So based on your comment, what you are looking for is the sum the values of all your arrays. This is how you can do it using UNNEST operator

WITH mydata  AS (
  SELECT [1.4, 1.3, 1.4, 1.1] as myarray
  union all 
  SELECT [1.4, 1.3, 1.4, 1.1] as myarray
  union all 
  SELECT [1.4, 1.3, 1.4, 1.1] as myarray
)

SELECT SUM(eachelement) from mydata, UNNEST(myarray) AS eachelement; 

Upvotes: 1

Related Questions