Reputation: 23
If I have a table that logs how many products were sold each day - and does not put in a log if 0 items were sold - how would I go about identifying all of my products had a day of 0 sales?
CafeTable sample
Date Item QuantitySold
Jan 1 coffee 4
Jan 1 tea 1
Jan 2 tea 3
Jan 3 coffee 2
Jan 3 tea 4
Desired output would be something like this. So far I can get to a version that lists each day and quantity sold, but can't figure out if there's a way to also "tag" the product.
Item TotalSold SoldDaily
coffee 6 FALSE
tea 8 TRUE
This is my rough starting point - the JOIN is to try to force all dates to show, but it's still skipping the row for coffee on Jan 2. And from there, I can't figure out how I would potentially then perform the next step - I've tried a few CASE WHENs but not coming together.
with dates as (select distinct date from cafetable)
select date,
product,
sum(quantitysold) as quant_sold,
min(quantitysold) as min_sold
from dates c
left join cafetable d on d.date=c.date
group by 1,2
I don't think I'm even articulating this well. I work with this table a dozen times a day and I feel like this query has fried my brain.
Upvotes: 2
Views: 42
Reputation: 10172
If there are days when there are no sales at all or there are zero sales at the last day of the report then you can use GENERATE_DATE_ARRAY:
WITH test_table AS (
SELECT DATE '2020-01-01' AS Date, 'coffee' AS Item, 4 AS QuantitySold UNION ALL
SELECT DATE '2020-01-01', 'tea', 1 UNION ALL
SELECT DATE '2020-01-02', 'tea', 3 UNION ALL
SELECT DATE '2020-01-03', 'coffee', 2 UNION ALL
SELECT DATE '2020-01-03', 'tea', 4
)
, all_items_table AS (
SELECT DISTINCT Item
FROM test_table
)
SELECT *
FROM all_items_table
CROSS JOIN UNNEST(GENERATE_DATE_ARRAY('2020-01-01', '2020-01-03')) AS Date
LEFT JOIN test_table USING (Date, Item)
WITH test_table AS (
SELECT DATE '2020-01-01' AS Date, 'coffee' AS Item, 4 AS QuantitySold UNION ALL
SELECT DATE '2020-01-01', 'tea', 1 UNION ALL
SELECT DATE '2020-01-02', 'tea', 3 UNION ALL
SELECT DATE '2020-01-03', 'coffee', 2 UNION ALL
SELECT DATE '2020-01-03', 'tea', 4
)
, all_items_table AS (
SELECT DISTINCT Item
FROM test_table
)
SELECT
Item,
LOGICAL_AND(QuantitySold IS NOT NULL) AS SoldDaily
FROM all_items_table
CROSS JOIN UNNEST(GENERATE_DATE_ARRAY('2020-01-01', '2020-01-03')) AS Date
LEFT JOIN test_table USING (Date, Item)
GROUP BY Item
Upvotes: 0
Reputation: 172993
Below is for BigQuery Standard SQL
#standardSQL
select item,
sum(QuantitySold) as TotalSold,
count(*) = date_diff(max(date), min(date), day) + 1 as SoldDaily
from `project.dataset.CafeTable`
group by item
You can test, play with above using sample data from your question as in below example
#standardSQL
with `project.dataset.CafeTable` as (
select date '2020-01-01' date, 'coffee' item, 4 QuantitySold union all
select '2020-01-01', 'tea', 1 union all
select '2020-01-02', 'tea', 3 union all
select '2020-01-03', 'coffee', 2 union all
select '2020-01-03', 'tea', 4
)
select item,
sum(QuantitySold) as TotalSold,
count(*) = date_diff(max(date), min(date), day) + 1 as SoldDaily
from `project.dataset.CafeTable`
group by item
with output
Upvotes: 1
Reputation: 521249
You could use logic here which asserts that all dates be present or not:
Below is for BigQuery Standard SQL:
select
c.date,
d.product,
sum(quantitysold) as quant_sold,
min(quantitysold) as min_sold,
case when count(distinct c.date) = (select count(distinct date) from dates)
then TRUE else FALSE end AS SoldDaily
from dates c
left join cafetable d
on d.date = c.date
group by
1, 2;
Upvotes: 1