Gross
Gross

Reputation: 165

Google BigQuery SQL: Fill in empty fields according to another column

I have a table from Google Analytics 4. Some of the emails are not shows and I can identify that it is the same user according to the same session id.

event_date  email                                      event_timestamp          ga_session_id
24/03/2021  null                                 2021-03-24 21:59:00.315935 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:04.370934 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:09.360001 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:09.552935 UTC     1616623136
24/03/2021  [email protected]  2021-03-24 21:59:22.563 UTC        1616623136
24/03/2021  [email protected]  2021-03-24 21:59:22.800800 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:24.298 UTC        1616623136
24/03/2021  null                                 2021-03-24 21:59:24.299001 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:24.535800 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:24.536801 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:24.834002 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:25.071802 UTC     1616623136
24/03/2021  [email protected]  2021-03-24 21:59:27.550003 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:27.566 UTC        1616623136
24/03/2021  null                                 2021-03-24 21:59:27.567001 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:27.614002 UTC     1616623136
24/03/2021  [email protected]  2021-03-24 21:59:27.787803 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:27.803800 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:27.804801 UTC     1616623136
24/03/2021  null                                 2021-03-24 21:59:27.851802 UTC     1616623136
24/03/2021  [email protected]  2021-03-24 21:59:34.516003 UTC     1616623136

I need to fill all empty email row by session_id I try to use "lead" or "lag" functions but in result I get only several fills:

SELECT event_date,
email,
Lead(email) over (partition by ga_session_id order by event_timestamp ) as ld,
Lag(email) over (partition by ga_session_id order by event_timestamp ) as lg,
event_timestamp,,
ga_session_id
from 'project.dataset.table'

What is the correct way to solve this?

Upvotes: 1

Views: 375

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269553

You can use window functions, but max() is most appropriate:

select t.*,
       max(email) over (partition by ga_session_id) as imputed_email
from `project.dataset.table` t;

Upvotes: 1

Related Questions