takotsubo
takotsubo

Reputation: 746

Group By two tables and then JOIN them

I have two tables:

Table 1 with PRIMARY KEY (campaign_id, type, keyword, report_date)

campaign_id type keyword impressions date
1 link shower gel 2 2021-05-01
1 search engine gel for body 5 2021-05-01
1 link fragrant gel 1 2021-05-01

Table 2 with PRIMARY KEY (campaign_id, sku, report_date)

campaign_id sku product_name orders date
1 3516 Product 1 1 2021-05-01
1 87218 Product 2 2 2021-05-01
1 4478 Product 3 4 2021-05-01

I want to JOIN them and GROUP by campaign_id and date to get next result:

campaign_id impressions orders date
1 8 7 2021-05-01

I wrote SQL:

CREATE OR REPLACE VIEW result.all_stats AS
SELECT
    t1.campaign_id,
    t1.report_date,
    sum(t1.clicks) AS clicks,
    sum(t2.orders) AS orders,
FROM
    result.table1 t1
    LEFT JOIN result.table2 t2 ON t1.campaign_id = t2.campaign_id
    AND t1.report_date = t2.report_date
GROUP BY
    t1.campaign_id,
    t1.report_date;

But it actually returns:

campaign_id impressions orders date
1 24 21 2021-05-01

So script joined result 3 times because 3 rows in table 2. I need to GROUP first table by campaign_id + date and GROUP second table by campaign_id + date and then JOIN them. How to fix it?

Upvotes: 0

Views: 147

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269763

I would suggest aggregating before joining:

SELECT campaign_id, report_date, c.clicks, o.orders
FROM (SELECT campaign_id, report_date, sum(clicks) as clicks
      FROM result.table1 t1
      GROUP BY campaign_id, report_date
     ) c LEFT JOIN
     (SELECT campaign_id, report_date, SUM(orders) as orders
      FROM result.table2 t2
      GROUP BY campaign_id, report_date
     ) o
     USING (campaign_id, report_date);

Note: You might want to consider FULL JOIN to take care of situations where you have orders for a campaign that are on a date with no clicks.

Upvotes: 1

eshirvana
eshirvana

Reputation: 24568

here is one way :

SELECT
    t1.campaign_id,
    t1.report_date,
    sum(t1.clicks) AS clicks,
    t2.orders AS orders,
FROM
    result.table1 t1
    LEFT JOIN lateral(
        select sum(t2.orders) orders
        from result.table2 t2 
        where t1.campaign_id = t2.campaign_id
        AND t1.report_date = t2.report_date
    ) t2 on true
GROUP BY
    t1.campaign_id,
    t1.report_date,
    t2.orders

Upvotes: 1

Related Questions