Reputation: 45
Input Dataset :
Date | ID | Number | Desired |
---|---|---|---|
2020-01-01 | 1 | 8 | 8 |
2020-01-02 | 1 | 11.5 | 19.5 |
2020-01-03 | 1 | -20 | 0 |
2020-01-04 | 1 | 10 | 10 |
2020-01-05 | 1 | -5 | 5 |
2020-01-06 | 2 | -9 | 0 |
2020-01-07 | 2 | 26 | 26 |
2020-01-08 | 2 | 5 | 31 |
2020-01-09 | 2 | -23 | 8 |
2020-01-10 | 2 | -10.5 | 0 |
2020-01-11 | 2 | 5 | 5 |
I am trying to create a custom running total that resets to 0 everytime the running total dips below 0.
My code
CREATE TEMP FUNCTION conditional_sum(X ARRAY<FLOAT64>)
RETURNS FLOAT64
LANGUAGE js as r"""
if (!('sum' in this)) this.sum =0
return this.sum = (X + this.sum) <0 ? 0 : this.sum + X
""";
select *,conditional_sum(array_agg(number) over (partition by ID order by Date)) as run_sum from tbl
The error that I get is
Failed to coerce output value "11.5,-20" to type FLOAT64. Bad double value
I borrowed this code from Running total of positive and negative numbers where the sum cannot go below zero and it works well without the window function and arrary_agg. But I need the window function because I need to apply the conditional sum to multiple IDs in Bigquery SQL
The other udf errors don't deal with window functions
Upvotes: 1
Views: 83
Reputation: 3538
Please add a for loop in the JavaScript code. Generate the array only to the current row.
CREATE TEMP FUNCTION conditional_sum(X ARRAY<FLOAT64>)
RETURNS FLOAT64
LANGUAGE js as r"""
var cur=0;
var out=[];
for(let x of X){
cur+=x*1;
if(cur<0) cur=0;
//out.push(cur)
}
return cur;
// return out;
""";
With tbl as (
SELECT '2020-01-01' AS Date, 1 AS ID, 8 AS Number, 8 AS Desired
UNION ALL SELECT '2020-01-02', 1, 11.5, 19.5
UNION ALL SELECT '2020-01-03', 1, -20, 0
UNION ALL SELECT '2020-01-04', 1, 10, 10
UNION ALL SELECT '2020-01-05', 1, -5, 5
UNION ALL SELECT '2020-01-06', 2, -9, 0
UNION ALL SELECT '2020-01-07', 2, 26, 26
UNION ALL SELECT '2020-01-08', 2, 5, 31
UNION ALL SELECT '2020-01-09', 2, -23, 8
UNION ALL SELECT '2020-01-10', 2, -10.5, 0
UNION ALL SELECT '2020-01-11', 2, 5, 5
)
select *,conditional_sum([1.1,2,1]),
string_agg(cast(number as float64) || '' ) over win1,
conditional_sum(array_agg(cast(number as float64)) over win1 ) as run_sum
from tbl
window win1 as (partition by ID order by Date rows between unbounded preceding and current row)
Upvotes: 2