GDG
GDG

Reputation: 215

Avoid cartesian product in three tables using LEFT join

I have following 3 tables:-

  1. customers (customer_id, name)
  2. achived (achived_id, customer_id, achieved)
  3. target (target_id, customer_id, target)

On writing the following query, I am getting wrong output due to cartesian product:-

SELECT C.name customer, SUM(A.achieved) achived, SUM(T.target) target
FROM customers C
LEFT JOIN achieved A
ON C.customer_id = A.customer_id
LEFT JOIN target T
ON C.customer_id = T.customer_id 
GROUP BY customer

What would be the right approach of getting correct result using LEFT join?

Upvotes: 1

Views: 3098

Answers (2)

Timo
Timo

Reputation: 8680

I understand that, because you are joining achieved and target both onto customer, you are getting a meaningless cartesian product of achieved and target.

You can remove SUM for a moment to see the cause of the problem. Without SUM, this query gives you a needlessly large result set. With SUM, the repeated values all contribute to the sum, creating a value larger than expected.

You can force MySQL to keep the joined tables apart by joining a constant sequence, as follows.

SELECT C.name customer, SUM(A.achieved) achieved, SUM(T.target) target

FROM customers C

# Split every customer into two rows, joinType=1 vs. joinType=2
LEFT JOIN (SELECT 1 AS joinItem UNION ALL SELECT 2) AS joinItems
ON TRUE

# Join achieved only where joinType=1 (will be NULL elsewhere)
LEFT JOIN achieved A
ON joinItem = 1 AND C.customer_id = A.customer_id

# Join target only where joinType=2 (will be NULL elsewhere)
LEFT JOIN target T
ON joinItem = 2 AND C.customer_id = T.customer_id 

GROUP BY customer

Now, achieved and target will no longer be multiplied with each other.

EXPLAIN should confirm that this is still handled efficiently.

Note: Initially, this creates a customer record per matching 'achieved' (with 'target' data NULL), plus a separate customer record per matching 'target' (with 'achieved' data NULL). Then, however, the GROUP BY in this scenario reunites them back into a single record per customer, with exactly the data we want.

Upvotes: 2

Rahul Narhe
Rahul Narhe

Reputation: 181

You can try this also...

SELECT C.name customer, SUM(A.achieved) achived, SUM(T.target) target
FROM customers C,achieved A,target T
WHERE C.customer_id = A.customer_id 
      AND C.customer_id = T.customer_id 
      AND A.customer_id = T.customer_id
GROUP BY C.customer

Upvotes: -1

Related Questions