Yaro
Yaro

Reputation: 19

One of the previous row value with condition

I'm generating a table like this in BigQuery with standard GoogleSQL:

sessionID hitNumber ListViewID Position Sponsored isClick impressions clicks
aaa20230425 1 aaa20230425-1 2 1 false 1 0
bbb20230425 1 bbb20230425-1 6 0 false 1 0
bbb20230425 1 bbb20230425-1 7 0 false 1 0
bbb20230425 1 bbb20230425-1 8 0 false 1 0
bbb20230425 1 bbb20230425-1 9 0 false 1 0
bbb20230425 1 bbb20230425-1 10 0 false 1 0
bbb20230425 1 bbb20230425-1 11 0 false 1 0
aaa20230425 2 aaa20230425-2 2 1 true 0 1
ccc20230425 1 ccc20230425-1 0 1 false 1 0
ddd20230425 17 ddd20230425-17 1 1 false 1 0

My code:

SELECT
    CONCAT(fullVisitorID, CAST(visitID AS string), date) AS sessionID,
    hits.hitNumber AS hitNumber,
    CONCAT( fullvisitorID, CAST(visitId AS STRING), date, hits.hitNumber ) AS ListViewID,
    product.productListPosition AS Position,
    customDimensions.value AS Sponsored,
    (CASE
         WHEN product.isClick = TRUE THEN TRUE
         ELSE FALSE
     END) AS isClick,
    SUM(
  IF
    (product.isImpression, 1,0)) AS impressions,
  SUM(
  IF
    (product.isClick,1,0)) AS clicks
FROM
    `Table`,
UNNEST (hits) AS hits,
UNNEST (hits.product) AS product,
UNNEST (product.customDimensions) AS customDimensions
WHERE
    product.productListName IN ("searchpage", "categorypage")
    AND customDimensions.index = 68
GROUP BY
    SessionID, hitNumber, ListViewID, Position,
    Sponsored, isClick

I need to create an additional column "parent_ListViewID" which:

Given the example above, I need to have a table like this:

sessionID hitNumber ListViewID Position Sponsored isClick impressions clicks parent_ListViewID
aaa20230425 1 aaa20230425-1 2 1 false 1 0 aaa20230425-1
bbb20230425 1 bbb20230425-1 6 0 false 1 0 bbb20230425-1
bbb20230425 1 bbb20230425-1 7 0 false 1 0 bbb20230425-1
bbb20230425 1 bbb20230425-1 8 0 false 1 0 bbb20230425-1
bbb20230425 1 bbb20230425-1 9 0 false 1 0 bbb20230425-1
bbb20230425 1 bbb20230425-1 10 0 false 1 0 bbb20230425-1
bbb20230425 1 bbb20230425-1 11 0 false 1 0 bbb20230425-1
aaa20230425 2 aaa20230425-2 2 1 true 0 1 aaa20230425-1
ccc20230425 1 ccc20230425-1 0 1 false 1 0 ccc20230425-1
ddd20230425 17 ddd20230425-17 1 1 false 1 0 ddd20230425-17

I've tried a solution with LAG() here but couldn't get it to work: Lag() with condition in sql server

I've also tried a few solutions with last_value, but failed as well - it returns last ListViewID value across all values with the same sessionID, not just from the previous rows.

Upvotes: 0

Views: 218

Answers (1)

Jaytiger
Jaytiger

Reputation: 12234

You can consider below.

-- sample data
WITH sample_table AS (
  -- put your sample data here
)
-- query starts here
SELECT *,
       IF (
         isClick IS TRUE,
         LAST_VALUE(IF(isClick IS FALSE, ListViewID, NULL) IGNORE NULLS) OVER w,
         ListViewID
       ) AS parent_ListViewID
  FROM sample_table
WINDOW w AS (PARTITION BY sessionID ORDER BY hitNumber);

Query result

enter image description here

  • assume that hitNumber is INT64 type.

Upvotes: 1

Related Questions