Fill values by user in BigQuery

Question

I've got data on how a set of customers spend money on a daily basis, with the following structure in BigQuery:

CREATE TABLE if not EXISTS daily_spend (
  user_id int,
  created_at DATE,
  value float
);

INSERT INTO daily_spend VALUES
(1, '2021-01-01', 0),
(1, '2021-01-02', 1),
(1, '2021-01-04', 1),
(1, '2021-01-05', 2),
(1, '2021-01-07', 5),
(2, '2021-01-01', 5),
(2, '2021-01-03', 0),
(2, '2021-01-04', 1),
(2, '2021-01-06', 2);

I'd like to complete the data for each user by putting 0's in the days the users didn't spent any money, only including days between their first and last days spending money.

So, the output table in this example would have the following values:

(1, '2021-01-01', 0),
(1, '2021-01-02', 1),
(1, '2021-01-03', 0),
(1, '2021-01-04', 1),
(1, '2021-01-05', 2),
(1, '2021-01-06', 0),
(1, '2021-01-07', 5),
(2, '2021-01-01', 5),
(2, '2021-01-02', 0),
(2, '2021-01-03', 0),
(2, '2021-01-04', 1),
(2, '2021-01-05', 0),
(2, '2021-01-06', 2)

What's the simplest way of doing this in BigQuery?

Mikhail Berlyant · Accepted Answer

Use below

select user_id, created_at, ifnull(value, 0) value
from (
  select user_id, min(created_at) min_date, max(created_at) max_date
  from daily_spend
  group by user_id
), unnest(generate_date_array(min_date, max_date)) created_at
left join daily_spend
using(user_id, created_at)

If applied to sample data in your question - output is

Fill values by user in BigQuery

Answers (1)

Related Questions