Reputation: 1055
I am trying to get daily sum of sales from a google big-query table. I used following code for that.
select Day(InvoiceDate) date, Sum(InvoiceAmount) sales from test_gmail_com.sales
where year(InvoiceDate) = Year(current_date()) and
Month(InvoiceDate) = Month(current_date())
group by date order by date
From the above query it gives only the sum of sales daily which were in the table. There is a chance that some days do not have any sales. For those kind of situations, I need to get the date and sum should be 0. As an example, in every month should 30 0r 31 rows with sum of sales. Examples show below. 4th day of the month does not have a sales. So its sum should be 0.
date | sales
-----+------
1 | 259
-----+------
2 | 359
-----+------
3 | 45
-----+------
4 | 0
-----+------
5 | 156
Is it possible to do in Big-query? Basically date column should be a series from 1 - 28/29/30 or 31st depending on the month of the year
Upvotes: 21
Views: 41350
Reputation: 730
Generting a list of dates and then joining whatever table you need on top seems the easiest. I used the generate_date_array
+ unnest
and it looks quite clean.
To generate a list of days (one day per row):
SELECT
*
FROM
UNNEST(GENERATE_DATE_ARRAY('2018-10-01', '2020-09-30', INTERVAL 1 DAY)) AS example
Upvotes: 63
Reputation: 528
Using the Standard SQL dialect and the generate_array
function to simplify the code:
WITH serialnum AS (
SELECT
sn
FROM
UNNEST(GENERATE_ARRAY(0,
DATE_DIFF(DATE_ADD(DATE_TRUNC(CURRENT_DATE()
, MONTH)
, INTERVAL 1 MONTH)
, DATE_TRUNC(CURRENT_DATE(), MONTH)
, DAY) - 1)
) AS sn
), date_seq AS (
SELECT
DATE_ADD(DATE_TRUNC(CURRENT_DATE(), MONTH),
INTERVAL(sn) DAY) AS this_day
FROM
serialnum
)
SELECT
Day(InvoiceDate) date
, Sum(IFNULL(InvoiceAmount, 0)) sales
FROM
date_seq
LEFT JOIN
test_gmail_com.sales
ON
date_seq.this_day = DAY(test_gmail_com.sales.InvoiceDate)
WHERE
year(InvoiceDate) = Year(current_date())
and
Month(InvoiceDate) = Month(current_date())
GROUP BY
date
ORDER BY
date
;
UPDATE
Or, simpler still using the generate_date_array
function:
WITH date_seq AS (
SELECT
GENERATE_DATE_ARRAY(DATE_TRUNC(CURRENT_DATE(), MONTH),
DATE_ADD(DATE_ADD(DATE_TRUNC(CURRENT_DATE(), MONTH)
, INTERVAL 1 MONTH)
, INTERVAL -1 DAY)
, INTERVAL 1 DAY)
AS this_day
)
SELECT
Day(InvoiceDate) date
, Sum(IFNULL(InvoiceAmount, 0)) sales
FROM
date_seq
LEFT JOIN
test_gmail_com.sales
ON
date_seq.this_day = DAY(test_gmail_com.sales.InvoiceDate)
WHERE
year(InvoiceDate) = Year(current_date())
and
Month(InvoiceDate) = Month(current_date())
GROUP BY
date
ORDER BY
date
;
Upvotes: 4
Reputation: 1311
For Standard SQL
WITH
splitted AS (
SELECT
*
FROM
UNNEST( SPLIT(RPAD('',
1 + DATE_DIFF(CURRENT_DATE(), DATE("2015-06-01"), DAY),
'.'),''))),
with_row_numbers AS (
SELECT
ROW_NUMBER() OVER() AS pos,
*
FROM
splitted),
calendar_day AS (
SELECT
DATE_ADD(DATE("2015-06-01"), INTERVAL (pos - 1) DAY) AS day
FROM
with_row_numbers)
SELECT
*
FROM
calendar_day
ORDER BY
day DESC
Upvotes: 0
Reputation: 172993
You can use below to generate on fly all dates in given range (in below example it is all dates from 2015-06-01 till CURRENT_DATE() - by changing those you can control which dates range to generate)
SELECT DATE(DATE_ADD(TIMESTAMP("2015-06-01"), pos - 1, "DAY")) AS calendar_day
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP(CURRENT_DATE()), TIMESTAMP("2015-06-01")), '.'),'') AS h
FROM (SELECT NULL)),h
)))
so, now - you can use it with LEFT JOIN with your table to have all dates accounted. See potential example below
SELECT
calendar_day,
IFNULL(sales, 0) AS sales
FROM (
SELECT DATE(DATE_ADD(TIMESTAMP("2015-06-01"), pos - 1, "DAY")) AS calendar_day
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP(CURRENT_DATE()), TIMESTAMP("2015-06-01")), '.'),'') AS h
FROM (SELECT NULL)),h
)))
) AS all_dates
LEFT JOIN (
SELECT DAY(InvoiceDate) DATE, SUM(InvoiceAmount) sales
FROM test_gmail_com.sales
WHERE YEAR(InvoiceDate) = YEAR(CURRENT_DATE()) AND
MONTH(InvoiceDate) = MONTH(CURRENT_DATE())
GROUP BY DATE
)
ON DATE = calendar_day
I wanna need to get previous months sales
Below gives all days of previous month
SELECT DATE(DATE_ADD(DATE_ADD(DATE_ADD(CURRENT_DATE(), -1, "MONTH"), 1 - DAY(CURRENT_DATE()), "DAY"), pos - 1, "DAY")) AS calendar_day
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF(DATE_ADD(CURRENT_DATE(), - DAY(CURRENT_DATE()), "DAY"), DATE_ADD(DATE_ADD(CURRENT_DATE(), -1, "MONTH"), 1 - DAY(CURRENT_DATE()), "DAY")), '.'),'') AS h
FROM (SELECT NULL)),h
)))
Upvotes: 12
Reputation: 1244
For these purposes it is practical to have a 'calendar' table, a table that just lists all the days within a certain range. For your specific question, it would suffice to have a table with the numbers 1 to 31. A quick way to get this table is to make a spreadsheet with these numbers, save it as a csv file and import this file into BigQuery as a table.
You then left outer join
your result set onto this table, with ifnull(sales,0) as sales
.
If you want the number of days per month (28--31) to be right, you basically have two options. Either you create a proper calendar table that covers several years and that you join on using year, month and day. Or you use the simple table with numbers 1--31 and remove numbers based on the month and the year.
Upvotes: 1