Pak Hang Leung
Pak Hang Leung

Reputation: 389

SQL - How to combine 2 dates from different tables with joins without error

Im my company, there are 2 tables to carry the campaign information and the webshop information.

Basic information

In the campaign table it carries the information like as follows:

CAMPAIGN_NAME  CREATION_DATE  NUM_DELIVERED  NUM_ERRORS
Promotion 101   2013-01-05      100,000        100
Promotion 105   2013-01-05      135,000        200
Promotion 104   2013-01-05      125,000         0
Promotion 103   2013-01-06      50,000          0

In the webshop it carries the information like this

VISIT_KEY    VISIT_AT  .....
 100200     2013-01-05
 105235     2013-01-05
 103050     2013-01-05

Desired outcome

We want to build up a table to show the effect for each day like

CREATION_DATE    VISIT_AT   NUM_DELIVERED  NUM_VISITS
 2013-01-05     2013-01-05    260,000        30,000
 2013-01-06     2013-01-06     50,000          0 

Before and after approach
to collect the information, before we were using the using the union method, which first do the aggregation in separate table and UNION ALL to another,

SELECT 
   campaign_date,
   visit_date,
   SUM(delivered),
   SUM(visits)
FROM
    ((Select
       CREATION_DATE::DATE as campaign_date,
       '1970-01-01'::DATE as visit_date
       SUM(NUM_DELIVERED) as delivered
       0 AS visits
    FROM 
       campaign
    GROUP BY 1,2)
    UNION ALL
    (Select
       '1970-01-01'::Date AS campaign_date,
       VISIT_AT::DATE AS visit_date
       0 AS delivered
       COUNT(VISIT_KEY) AS visits
    FROM 
       campaign
    GROUP BY 1,2))
 GROUP BY 1,2

which look like this

campaign_date visit_date   delivered      visits
2013-01-05    1970-01-01    260,000          0
1970-01-01    2013-01-05       0          30,000
2013-01-06    1970-01-01     50,000          0     

Now I tried to combine with a left join on campaign.CREATION_DATE = webshop.VISIT_AT like this:

Select 
  campaign.CREATION_DATE as campaign_date, 
  webshop.VISIT_AT as visits,
  SUM(campaign.NUM_DELIVERED) as delivered,
  COUNT(webshop.VISIT_KEY) AS visits
FROM 
  webshop LEFT JOIN campaign ON webshop.VISIT_AT = campaign.CREATION_DATE

But the figure is totally different....

Question

1, What is the possible error in this query? Coz I am supposing to get the same information and the same result should be expected...

2, How can I achieve the desired outcome?

For your reference, I am using Amazon redshift.

Many thanks for your help in advance and have a nice weekend!

Upvotes: 2

Views: 2802

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1271151

One method uses union all and group by:

select dte, sum(num_delivered) as num_delivered, sum(num_visits) as num_visits
from ((select creation_date as dte, sum(num_delivered) as num_delivered, 0 as num_visits
       from campaign
       group by creation_date
      ) union all
      (select visit_at, 0 as num_delivered, sum(num_visits) as num_visits
       from webshop
       group by visit_at
      )
     ) cw
group by dte
order by dte;

I see no reason to have two date columns.

An alternative is a full outer join after aggregating:

select coalesce(creation_date, visit_at) as dte,
       coalesce(num_delivered, 0) as num_delivered, 
       coalesce(num_visits, 0) as num_visits
from (select creation_date, sum(num_delivered) as num_delivered, 0 as num_visits
      from campaign
      group by creation_date
     ) c full outer join
     (select visit_at, 0 as num_delivered, sum(num_visits) as num_visits
      from webshop
      group by visit_at
     ) 
     on w.visit_at = c.creation_dte
order by dte;

Upvotes: 2

Nishant Gupta
Nishant Gupta

Reputation: 3656

Solution to your problem:

USING DISTINCT

SELECT DISTINCT c.Creation_Date, 
       c.Creation_Date AS Visit_At, 
       c.Num_Delivered, 
       c.Num_Visits
FROM (
      SELECT c.Creation_Date, 
             SUM(c.Num_Delivered) AS Num_Delivered, 
             SUM(c.Num_Errors) AS Num_Visits
      FROM Campaign AS c
      GROUP BY c.Creation_Date
     ) AS c
LEFT JOIN Webshop AS w
ON c.Creation_Date = w.Visit_At

OR

You can use GROUP BY Instead of DISTINCT:

SELECT c.Creation_Date, c.Creation_Date AS Visit_At, c.Num_Delivered, c.Num_Visits
FROM (
      SELECT c.Creation_Date, SUM(c.Num_Delivered) AS Num_Delivered, SUM(c.Num_Errors) AS Num_Visits
      FROM Campaign AS c
      GROUP BY c.Creation_Date
     ) AS c
LEFT JOIN Webshop AS w
ON c.Creation_Date = w.Visit_At
GROUP BY c.Creation_Date, c.Creation_Date , c.Num_Delivered, c.Num_Visits

OUTPUT:

Creation_Date   Visit_At     Num_Delivered  Num_Visits
2013-01-05      2013-01-05   360000          30000
2013-01-06      2013-01-06   50000           0

Link To the demo:

http://sqlfiddle.com/#!9/22fe0/1

Upvotes: 2

Related Questions