bbk611
bbk611

Reputation: 321

BigQuery subtract elements in array

I was wondering how I could calculate the difference between elements in arrays that have only two elements, by subtracting the lesser from the greater.

I started by referring to the code sample from the BQ documentation (https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays#arrays_and_aggregation) with a few tweaks of my own to leave only 2 elements in the arrays. I know this example is for SUMs, but I was wondering if there's a way to modify it to calculate differences.

WITH sequences AS
  (SELECT [0, 5] AS some_numbers
   UNION ALL SELECT [2, 32] AS some_numbers
   UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
  (SELECT SUM(x)
   FROM UNNEST(s.some_numbers) x) AS sums
FROM sequences s;

These are my desired results.

+--------------------+------+
| some_numbers       | diff |
+--------------------+------+
| [0, 5]             | 5    |
| [2, 32]            | 30   |
| [5, 15]            | 10   |
+--------------------+------+

Upvotes: 1

Views: 2041

Answers (3)

Alexandre Moraes
Alexandre Moraes

Reputation: 4051

As an alternative to the answers already posted to further contribute with the comunity, I would suggest the use of User Defined Functions (UDF) in BigQuery.

This resource allows you to create functions in SQL or JavaScript, which accept columns of input and execute actions, returning the desired result after the transformations. I would also add that they are easy to modify in case it is needed (from a sum to a subtraction function, for instance). You can read more about it here.

I created a UDF using JavaScript, it is below:

CREATE TEMP FUNCTION
  sub_arr (arr ARRAY<int64>)
  RETURNS int64
  LANGUAGE js AS '''
arr_sub=0
 for(var i = 0; i < arr.length; i++){
    arr_sub = parseInt(arr[i]) - arr_sub ;
  };
  return arr_sub;
''';

#Sample data
WITH
  data AS (
  SELECT [2, 7] AS some_array UNION ALL
  SELECT [10, 32] AS some_array UNION ALL
  SELECT [15,20] AS some_array )

#query using the UDF
SELECT sub_arr(some_array) AS result_diff
FROM data

and the output,

enter image description here

I would like to point that you have to specify the type of array you will use in the function, in this case ARRAY<int64>. Furthermore, according to the documentation Javascript's UDF does not accept INT64 as a return type and it converts to FLOAT64, it is a good practice to pay attention in the data types you are using.

Upvotes: 2

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

Below is for BigQuery Standard SQL

#standardSQL
WITH sequences AS (
  SELECT [0, 5] AS some_numbers UNION ALL 
  SELECT [2, 32] AS some_numbers UNION ALL 
  SELECT [5, 10] AS some_numbers
)
SELECT some_numbers, 
  ABS(some_numbers[OFFSET(0)] - some_numbers[OFFSET(1)]) diff
FROM sequences   

with output

enter image description here

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1270401

How about just subtracting?

WITH sequences AS (
   SELECT [0, 5] AS some_numbers union all
   SELECT [2, 32] AS some_numbers union all
   SELECT [5, 10] AS some_numbers
   )
select some_numbers,
       some_numbers[ordinal(2)] - some_numbers[ordinal(1)]
from sequences;

Upvotes: 1

Related Questions