Reputation: 165
I have a table from Google Analytics 4. Some of the emails are not shows and I can identify that it is the same user according to the same session id.
event_date email event_timestamp ga_session_id
24/03/2021 null 2021-03-24 21:59:00.315935 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:04.370934 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:09.360001 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:09.552935 UTC 1616623136
24/03/2021 [email protected] 2021-03-24 21:59:22.563 UTC 1616623136
24/03/2021 [email protected] 2021-03-24 21:59:22.800800 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:24.298 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:24.299001 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:24.535800 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:24.536801 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:24.834002 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:25.071802 UTC 1616623136
24/03/2021 [email protected] 2021-03-24 21:59:27.550003 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:27.566 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:27.567001 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:27.614002 UTC 1616623136
24/03/2021 [email protected] 2021-03-24 21:59:27.787803 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:27.803800 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:27.804801 UTC 1616623136
24/03/2021 null 2021-03-24 21:59:27.851802 UTC 1616623136
24/03/2021 [email protected] 2021-03-24 21:59:34.516003 UTC 1616623136
I need to fill all empty email row by session_id I try to use "lead" or "lag" functions but in result I get only several fills:
SELECT event_date,
email,
Lead(email) over (partition by ga_session_id order by event_timestamp ) as ld,
Lag(email) over (partition by ga_session_id order by event_timestamp ) as lg,
event_timestamp,,
ga_session_id
from 'project.dataset.table'
What is the correct way to solve this?
Upvotes: 1
Views: 375
Reputation: 1269553
You can use window functions, but max()
is most appropriate:
select t.*,
max(email) over (partition by ga_session_id) as imputed_email
from `project.dataset.table` t;
Upvotes: 1