justAbit
justAbit

Reputation: 4256

BigQuery - loop and previous value reference

I have a logic in some script that I want to implement using BigQuery. Below is pseudo code:

// assuming that 'items' is populated from some BQ table
items = ["A", "B", "C", "D", "E"]
var1 = "v1"   //initialize variable
var2 = "v2"   //initialize variable

for item in items
    var1 = var2
    var2 = var1 + item
    print(item, var1, var2)

Output:

item  var1      var2
A     v2        v2A
B     v2A       v2AB
C     v2AB      v2ABC
D     v2ABC     v2ABCD
E     v2ABCD    v2ABCDE

I'm trying to use BigQuery window and lag() without any luck till now. Below is the query that I'm trying in BQ:

WITH t AS (SELECT 'A' item UNION ALL
            SELECT 'B' as item UNION ALL
            SELECT 'C' as item UNION ALL
            SELECT 'D' as item UNION ALL
            SELECT 'E' as item)
select item, 
  if (rn = 1, 'v2', lag(concat(var2, item)) over(order by 1)) var1,
  var1 var2
  from (select item, row_number() over() as rn, 'v1' var1, 'v2' var2 from t)

Let me know if there is any workaround.

Upvotes: 1

Views: 368

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Below is for BigQuery Standard SQL - to help get you started

#standardSQL
SELECT item, 
  CONCAT('v2', IFNULL(STRING_AGG(item, '') OVER(ORDER BY item ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), '')) var1,
  CONCAT('v2', STRING_AGG(item, '') OVER(ORDER BY item ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) var2
FROM `project.dataset.table`

If to apply to sample data from your question - as in below example -

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'A' item UNION ALL
  SELECT 'B' AS item UNION ALL
  SELECT 'C' AS item UNION ALL
  SELECT 'D' AS item UNION ALL
  SELECT 'E' AS item
)
SELECT item, 
  CONCAT('v2', IFNULL(STRING_AGG(item, '') OVER(ORDER BY item ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), '')) var1,
  CONCAT('v2', STRING_AGG(item, '') OVER(ORDER BY item ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) var2
FROM `project.dataset.table`
-- ORDER BY item

result is

Row     item    var1        var2     
1       A       v2          v2A  
2       B       v2A         v2AB     
3       C       v2AB        v2ABC    
4       D       v2ABC       v2ABCD   
5       E       v2ABCD      v2ABCDE    

With your real data, you most likely have some column that you can use for desired ordering of your items in output vars - for the sake of example I just used ordering by items themselves

As a varioation of above script - you can use below version

#standardSQL
SELECT item, var1, CONCAT(var1, item) var2 FROM (
  SELECT item, CONCAT('v2', IFNULL(var, '')) var1 FROM (
    SELECT item, 
      STRING_AGG(item, '') OVER(ORDER BY item ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) var
    FROM `project.dataset.table`
  )
)

Upvotes: 2

Related Questions