willem-h
willem-h

Reputation: 145

How to add current session time to each event in BigQuery?

I've got some data that looks similar to this:

input data

I want to add a column that contains the start time of the session that each event occurred in so that the output looks something like this:

desired output data

The session_start_time column is based on the session_start event.

I've tried using partitions in analytic functions but to do so I need values that are the same in each row to start with and if I had that I would have solved my problem.

I've also tried FIRST_VALUE with a window function but I haven't managed to pull only the events where the event_name is "session_start" because I can't see a way to filter inside window functions.

How can I achieve this using Standard SQL on BigQuery?

Below is a sample query that includes the sample data:

WITH user_events AS (
  SELECT
    1 AS user_id,
    'session_start' AS event_name,
    0 AS event_time
  UNION ALL SELECT 1, 'video_play', 2
  UNION ALL SELECT 1, 'ecommerce_purchase', 3
  UNION ALL SELECT 1, 'session_start', 100
  UNION ALL SELECT 1, 'video_play', 105
)

SELECT
  user_id,
  event_name,
  event_time
FROM
  user_events
ORDER BY
  event_time

Upvotes: 0

Views: 303

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

#standardSQL
WITH user_events AS (
  SELECT 1 AS user_id, 'session_start' AS event_name, 0 AS event_time UNION ALL 
  SELECT 1, 'video_play', 2 UNION ALL 
  SELECT 1, 'ecommerce_purchase', 3 UNION ALL 
  SELECT 1, 'session_start', 100 UNION ALL 
  SELECT 1, 'video_play', 105
)
SELECT 
  user_id,
  event_name,
  event_time,
  MIN(event_time) OVER(PARTITION BY user_id, session) AS session_start_time
FROM (
  SELECT
    user_id,
    event_name,
    event_time,
    COUNTIF(event_name='session_start') OVER(PARTITION BY user_id ORDER BY event_time) AS session
  FROM user_events
)
ORDER BY event_time

Upvotes: 1

Related Questions