nshabi
nshabi

Reputation: 63

How can I SUM two tables with a join without duplicate data?

I can't do a SUM properly between two tables and I don't understand the reason. If someone help me, I'll be very greatful.

I have this first query really simple, it works well. The result for 22/01/2020 is Cost="252.263602". This is the correct answer.

SELECT
  Date,
  AdGroupId,
  SUM(A1.Cost)/1000000 AS Cost
FROM
  `table` A1
GROUP BY
  1, 2
ORDER BY
  Date DESC

But I need to join this table with another one to get more information. So to do that, I have this query:

SELECT
  A1.Date,
  A1.AdGroupId,
  SUM(V1.VideoViews) AS VideoViews,
  SUM(A1.Cost)/1000000 AS Cost
FROM
  `table` A1
INNER JOIN
  `table2` V1
ON
 (A1.Date = V1.Date
 AND A1.AdGroupId = V1.AdGroupId)
GROUP BY
  1, 2
ORDER BY
  Date DESC

What is the result for 22/01/2020? Cost = "1009.054408". It's wrong...

I have tried with different JOINS, changing ON with USING... and It doesn't work.

I have tried to join both tables without SUM and it works well. I get info and I can join more columns from both tables.

What am I doing to get a wrong error in SUM?

Thanks so much!

Upvotes: 0

Views: 2806

Answers (2)

nshabi
nshabi

Reputation: 63

I solved my problem. I copy the code here just in case someone need it.

thanks.

SELECT
  A1.Date,
  A1.AdGroupId,
  Cost,
  Views
FROM (
    SELECT
      Date,
      AdGroupId,
      SUM(Cost)/1000000 AS Cost
    FROM
      `table1`
      GROUP BY
      Date,
      AdGroupId) A1
  LEFT JOIN (
    SELECT
      Date,
      AdGroupId,
      SUM(VideoViews) AS Views
    FROM
      `table2`
       GROUP BY
      Date,
      AdGroupId) V1
ON
  A1.Date = V1.Date
  AND A1.AdGroupId = V1.AdGroupId
ORDER BY
  Date DESC

Upvotes: 1

GMB
GMB

Reputation: 222582

This is a cardinality problem. You have more than you one record per group in both tables, so as a result the same value is sumed more than once. You can exhibit the problem by running the join query without aggregation.

One typical solution is to aggregate in subqueries, then join:

SELECT
    A1.Date,
    A1.AdGroupId
    A1.Cost
    V1.VideoViews
FROM (
    SELECT
        Date,
        AdGroupId,
        SUM(Cost)/1000000 AS Cost
    FROM `table`
    GROUP BY 1, 2
) A1
INNER JOIN (
    SELECT
        Date,
        AdGroupId,
        SUM(VideoViews) AS VideoViews
    FROM `table2`
    GROUP BY 1, 2
) V1
    ON A1.Date = V1.Date
    AND A1.AdGroupId = V1.AdGroupId


Upvotes: 3

Related Questions