Reputation: 389
Im my company, there are 2 tables to carry the campaign information and the webshop information.
Basic information
In the campaign table it carries the information like as follows:
CAMPAIGN_NAME CREATION_DATE NUM_DELIVERED NUM_ERRORS
Promotion 101 2013-01-05 100,000 100
Promotion 105 2013-01-05 135,000 200
Promotion 104 2013-01-05 125,000 0
Promotion 103 2013-01-06 50,000 0
In the webshop it carries the information like this
VISIT_KEY VISIT_AT .....
100200 2013-01-05
105235 2013-01-05
103050 2013-01-05
Desired outcome
We want to build up a table to show the effect for each day like
CREATION_DATE VISIT_AT NUM_DELIVERED NUM_VISITS
2013-01-05 2013-01-05 260,000 30,000
2013-01-06 2013-01-06 50,000 0
Before and after approach
to collect the information, before we were using the using the union method, which first do the aggregation in separate table and UNION ALL to another,
SELECT
campaign_date,
visit_date,
SUM(delivered),
SUM(visits)
FROM
((Select
CREATION_DATE::DATE as campaign_date,
'1970-01-01'::DATE as visit_date
SUM(NUM_DELIVERED) as delivered
0 AS visits
FROM
campaign
GROUP BY 1,2)
UNION ALL
(Select
'1970-01-01'::Date AS campaign_date,
VISIT_AT::DATE AS visit_date
0 AS delivered
COUNT(VISIT_KEY) AS visits
FROM
campaign
GROUP BY 1,2))
GROUP BY 1,2
which look like this
campaign_date visit_date delivered visits
2013-01-05 1970-01-01 260,000 0
1970-01-01 2013-01-05 0 30,000
2013-01-06 1970-01-01 50,000 0
Now I tried to combine with a left join on campaign.CREATION_DATE = webshop.VISIT_AT like this:
Select
campaign.CREATION_DATE as campaign_date,
webshop.VISIT_AT as visits,
SUM(campaign.NUM_DELIVERED) as delivered,
COUNT(webshop.VISIT_KEY) AS visits
FROM
webshop LEFT JOIN campaign ON webshop.VISIT_AT = campaign.CREATION_DATE
But the figure is totally different....
Question
1, What is the possible error in this query? Coz I am supposing to get the same information and the same result should be expected...
2, How can I achieve the desired outcome?
For your reference, I am using Amazon redshift.
Many thanks for your help in advance and have a nice weekend!
Upvotes: 2
Views: 2802
Reputation: 1271151
One method uses union all
and group by
:
select dte, sum(num_delivered) as num_delivered, sum(num_visits) as num_visits
from ((select creation_date as dte, sum(num_delivered) as num_delivered, 0 as num_visits
from campaign
group by creation_date
) union all
(select visit_at, 0 as num_delivered, sum(num_visits) as num_visits
from webshop
group by visit_at
)
) cw
group by dte
order by dte;
I see no reason to have two date columns.
An alternative is a full outer join
after aggregating:
select coalesce(creation_date, visit_at) as dte,
coalesce(num_delivered, 0) as num_delivered,
coalesce(num_visits, 0) as num_visits
from (select creation_date, sum(num_delivered) as num_delivered, 0 as num_visits
from campaign
group by creation_date
) c full outer join
(select visit_at, 0 as num_delivered, sum(num_visits) as num_visits
from webshop
group by visit_at
)
on w.visit_at = c.creation_dte
order by dte;
Upvotes: 2
Reputation: 3656
Solution to your problem:
USING DISTINCT
SELECT DISTINCT c.Creation_Date,
c.Creation_Date AS Visit_At,
c.Num_Delivered,
c.Num_Visits
FROM (
SELECT c.Creation_Date,
SUM(c.Num_Delivered) AS Num_Delivered,
SUM(c.Num_Errors) AS Num_Visits
FROM Campaign AS c
GROUP BY c.Creation_Date
) AS c
LEFT JOIN Webshop AS w
ON c.Creation_Date = w.Visit_At
OR
You can use GROUP BY
Instead of DISTINCT
:
SELECT c.Creation_Date, c.Creation_Date AS Visit_At, c.Num_Delivered, c.Num_Visits
FROM (
SELECT c.Creation_Date, SUM(c.Num_Delivered) AS Num_Delivered, SUM(c.Num_Errors) AS Num_Visits
FROM Campaign AS c
GROUP BY c.Creation_Date
) AS c
LEFT JOIN Webshop AS w
ON c.Creation_Date = w.Visit_At
GROUP BY c.Creation_Date, c.Creation_Date , c.Num_Delivered, c.Num_Visits
OUTPUT:
Creation_Date Visit_At Num_Delivered Num_Visits
2013-01-05 2013-01-05 360000 30000
2013-01-06 2013-01-06 50000 0
Link To the demo:
Upvotes: 2