Reputation: 28795
I'm trying to merge two datasets - one of sales targets and another of actual sales, both by day and market (US/UK).
To do this, I'm using a third table which is using GENERATE_DATE_ARRAY
to create a master list of dates to report on - so that I don't end up with gaps where there's no target set and no sales reported.
I've found I'm getting sales counted twice, so have reduced my data and query down to a reproducible state:
#standardSQL
WITH dates AS (
SELECT day FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2018-07-05', '2018-07-09', INTERVAL 1 DAY)) AS day
),
targets AS (
SELECT DATE '2018-07-06' AS day, 'UK' AS Market, NUMERIC '2.4' AS quantity
UNION ALL SELECT '2018-07-06', "US", 8.4
UNION ALL SELECT '2018-07-06', "US", 1.2
UNION ALL SELECT '2018-07-08', "UK", 3.0
UNION ALL SELECT '2018-07-08', "US", 10.9
),
sales AS (
SELECT DATE '2018-07-08' AS day, 'UK' AS Market, 4 AS quantity
UNION ALL SELECT '2018-07-06', 'US', 15
)
SELECT
dates.day AS day,
targets.market AS market,
SUM(targets.quantity) AS targetQuantity,
SUM(sales.quantity) AS quantity
FROM dates
LEFT JOIN targets
ON dates.day = CAST(targets.day AS DATE)
LEFT JOIN sales
ON dates.day = CAST(sales.day AS DATE) AND targets.market = sales.market
GROUP BY day, market
ORDER BY day, market
This gives the following results:
The results show that the reported sales quantity for July 6th (row #3) is 30, despite being 15 in the data.
It's happening when there are two rows for that date & market in the targets
data, but I can't work out how to code for this.
Thanks for your help!
Upvotes: 0
Views: 603
Reputation: 172993
Below should work. The idea is to pre-aggregate both sales and targets tables to avoid dups
#standardSQL
WITH dates AS (
SELECT day FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2018-07-05', '2018-07-09', INTERVAL 1 DAY)) AS day
), targets AS (
SELECT DATE '2018-07-06' AS day, 'UK' AS Market, NUMERIC '2.4' AS quantity
UNION ALL SELECT '2018-07-06', "US", 8.4
UNION ALL SELECT '2018-07-06', "US", 1.2
UNION ALL SELECT '2018-07-08', "UK", 3.0
UNION ALL SELECT '2018-07-08', "US", 10.9
), sales AS (
SELECT DATE '2018-07-08' AS day, 'UK' AS Market, 4 AS quantity
UNION ALL SELECT '2018-07-06', 'US', 15
)
SELECT
dates.day AS day,
t.market AS market,
targetQuantity,
quantity
FROM dates
LEFT JOIN (SELECT day, market, SUM(quantity) AS targetQuantity FROM targets GROUP BY day, market) t
ON dates.day = CAST(t.day AS DATE)
LEFT JOIN (SELECT day, market, SUM(quantity) AS quantity FROM sales GROUP BY day, market) s
ON dates.day = CAST(s.day AS DATE) AND t.market = s.market
ORDER BY day, market
Upvotes: 2