Marc H.
Marc H.

Reputation: 57

How to generate date series in BigQuery and fill zeros for missing sales data

I have tried several approaches to get the desired output below

date         | sales
-------------+------
2021-04-01   |   128
-------------+------
2021-04-02   |   0
-------------+------
2021-04-03   |   234
-------------+------
2021-04-04   |   188
-------------+------
2021-04-05   |   432
-------------+------
2021-04-06   |   262
-------------+------
2021-04-07   |   0
-------------+------

At first I've generated a table days in BigQuery that has the following schema

day
-------------+------
2021-04-01
-------------+------
2021-04-02
-------------+------
2021-04-03
-------------+------
2021-04-04
-------------+------
2021-04-05
-------------+------
2021-04-06
-------------+------
2021-04-07
-------------+------

My sales table has the following schema

transaction_id  STRING  
location_id STRING  
date_local  TIMESTAMP   

date_local                     |     transaction_id                     |  location_id
-------------------------------+----------------------------------------+---------------
2021-04-01 13:02:58 UTC        |   55a99f8d-c9fe-4aa0-8a1f-9b193369883a | 164
-------------------------------+----------------------------------------+---------------
2021-04-03 21:22:27 UTC        |   19518c8d-9a6b-46a6-a509-81b4452fe94f | HARBOR1
-------------------------------+----------------------------------------+---------------
2021-04-04 05:17:10 UTC        |   220419a8-8cf9-4404-9394-f341a394fb10 | 5513
-------------------------------+----------------------------------------+---------------

I've tried this query

SELECT day,count(transaction_id) as sales FROM days AS A
LEFT JOIN sales AS B
ON A.day = DATE(B.date_local)
WHERE DATE(B.date_local) BETWEEN "2021-04-01" AND CURRENT_DATE() AND location_id = '164' GROUP BY day ORDER BY day 

but still get a result like this

day            sales
-------------+------
2021-04-01   |  1
-------------+------

It's missing all the other dates in April.

I was assuming NULL values for the sales column after I left joined the sales table.

Am I completely on the wrong track? It would be great if someone can help me out.

Upvotes: 1

Views: 403

Answers (2)

Veikko
Veikko

Reputation: 3620

You have the left join with tables days and sales correctly set, but your where part will filter only the rows based on date in sales table (B.date_local). This date will be null if there is no sales transactions for that day.

Change your where part to filter on day -field from days-table. This will at least bring the days with zero sales to your result. The changed where clause would be like this:

...
WHERE DATE(A.day) BETWEEN "2021-04-01" AND CURRENT_DATE() ...

Upvotes: 1

Sergey Geron
Sergey Geron

Reputation: 10232

Try removing AND location_id = '164'

Upvotes: 1

Related Questions