otblabla
otblabla

Reputation: 31

Joining a row to a previous row with a condition (Bigquery sql)

I want to join a table containing a series of sub-events to the first one's id, so i can aggregate on the complete event later. Couldn't find a command that does that in bigquery sql.

Alternatively, it would help if I could create a joint table of each start sub-event to its respective ending, so I can then check for each row when it's between the start and the ending of a session, to determine which session it's related to.

If anyone knows how to do one of the two options, it would be a great help. Thanks!

The table looks like that:

user id         time        part
111  1  01/03/2020 09:00    Start
111  2  01/03/2020 09:01    Middle
111  3  01/03/2020 09:03    end
111  4  01/03/2020 19:00    start
111  5  01/03/2020 19:10    Middle
111  6  01/03/2020 19:11    Middle
111  7  01/03/2020 19:20    End
222  8  02/03/2020 07:32    Start
222  9  02/03/2020 07:35    End
222  10 03/03/2020 20:40    Start
222  11 03/03/2020 20:45    end

and I want it to look like that:

user id         time        part
111  1  01/03/2020 09:00    Start
111  1  01/03/2020 09:01    Middle
111  1  01/03/2020 09:03    end
111  4  01/03/2020 19:00    start
111  4  01/03/2020 19:10    Middle
111  4  01/03/2020 19:11    Middle
111  4  01/03/2020 19:20    End
222  8  02/03/2020 07:32    Start
222  8  02/03/2020 07:35    End
222  10 03/03/2020 20:40    Start
222  10 03/03/2020 20:45    end

Upvotes: 1

Views: 650

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Below is for BigQuery Standard SQL

#standardSQL
SELECT user,
  MIN(id) OVER(PARTITION BY user, grp) AS id, 
  time, part
FROM (
  SELECT *, 
    COUNTIF(LOWER(part) = 'start') OVER(PARTITION BY user ORDER BY time) grp
  FROM `project.dataset.table`
)

if to apply to sample data from your question - result is

Row user    id  time                part     
1   111     1   01/03/2020 09:00    Start    
2   111     1   01/03/2020 09:01    Middle   
3   111     1   01/03/2020 09:03    end  
4   111     4   01/03/2020 19:00    start    
5   111     4   01/03/2020 19:10    Middle   
6   111     4   01/03/2020 19:11    Middle   
7   111     4   01/03/2020 19:20    End  
8   222     8   02/03/2020 07:32    Start    
9   222     8   02/03/2020 07:35    End  
10  222     10  03/03/2020 20:40    Start    
11  222     10  03/03/2020 20:45    end  

Upvotes: 2

Related Questions