Reputation: 145
I've got some data that looks similar to this:
I want to add a column that contains the start time of the session that each event occurred in so that the output looks something like this:
The session_start_time
column is based on the session_start
event.
I've tried using partitions in analytic functions but to do so I need values that are the same in each row to start with and if I had that I would have solved my problem.
I've also tried FIRST_VALUE
with a window function but I haven't managed to pull only the events where the event_name
is "session_start" because I can't see a way to filter inside window functions.
How can I achieve this using Standard SQL on BigQuery?
Below is a sample query that includes the sample data:
WITH user_events AS (
SELECT
1 AS user_id,
'session_start' AS event_name,
0 AS event_time
UNION ALL SELECT 1, 'video_play', 2
UNION ALL SELECT 1, 'ecommerce_purchase', 3
UNION ALL SELECT 1, 'session_start', 100
UNION ALL SELECT 1, 'video_play', 105
)
SELECT
user_id,
event_name,
event_time
FROM
user_events
ORDER BY
event_time
Upvotes: 0
Views: 303
Reputation: 172974
#standardSQL
WITH user_events AS (
SELECT 1 AS user_id, 'session_start' AS event_name, 0 AS event_time UNION ALL
SELECT 1, 'video_play', 2 UNION ALL
SELECT 1, 'ecommerce_purchase', 3 UNION ALL
SELECT 1, 'session_start', 100 UNION ALL
SELECT 1, 'video_play', 105
)
SELECT
user_id,
event_name,
event_time,
MIN(event_time) OVER(PARTITION BY user_id, session) AS session_start_time
FROM (
SELECT
user_id,
event_name,
event_time,
COUNTIF(event_name='session_start') OVER(PARTITION BY user_id ORDER BY event_time) AS session
FROM user_events
)
ORDER BY event_time
Upvotes: 1