Reputation: 9212
3 tables.
table_customers - customer_id, name
table_orders - order_id, customer_id, order_datetime
table_wallet - customer_id, amount, type // type 1- credit, type 2- debit
Need to get all customers, their total balance, and their last order date and order id. If customer have not placed any return order date as 0000-00-00 and order id as 0.
This is my query.
SELECT
C.customer_id,
C.name,
COALESCE( SUM(CASE WHEN type = 2 THEN -W.amount ELSE W.amount END), 0) AS value,
COALESCE( max( O.order_id ) , '0' ) AS last_order_id,
COALESCE( max( date( O.order_datetime ) ) , '0000-00-00' ) AS last_order_date
FROM
table_customers as C
LEFT JOIN
table_wallet as W
ON C.customer_id = W.customer_id
LEFT JOIN
table_orders AS O
ON W.customer_id = O.customer_id
group by C.customer_id
ORDER BY C.customer_id
Everything is coming correct except customer's value. From result it seems its getting added multiple times.
I have created the fiddle here. http://sqlfiddle.com/#!9/560f2/1
What is wrong in query? Can anyone help me on this?
Edit: Expected result
customer_id name value last_order_id last_order_date
1 abc 20 3 2016-06-22
2 def 112.55 0 0000-00-00
3 pqrs 0 4 2016-06-15
4 wxyz 0 0 0000-00-00
Upvotes: 1
Views: 54
Reputation: 108651
This is the classic combinatorial explosion problem when you JOIN
tables containing unrelated data.
You need to compute each customer's balance in a subquery. That subquery must yield either one row or zero rows per customer_id. It might look like this. (http://sqlfiddle.com/#!9/560f2/8/0)
SELECT customer_id,
SUM(CASE WHEN type = 2 THEN -amount ELSE amount END) AS value
FROM table_wallet
GROUP BY customer_id
Similarly, you need to retrieve each customer's latest order in a subquery (http://sqlfiddle.com/#!9/560f2/10/0) . Again, it needs either one row or zero rows per customer_id.
SELECT customer_id,
MAX(order_id) AS order_id,
DATE(MAX(order_datetime)) AS order_date
FROM table_orders
GROUP BY customer_id
Then, you can LEFT JOIN
those two subqueries as if they were tables, to your table_customers
. The subqueries are tables; they're virtual tables. (http://sqlfiddle.com/#!9/560f2/12/0)
SELECT c.customer_id,
c.name,
w.value,
o.order_id,
o.order_date
FROM table_customers c
LEFT JOIN (
SELECT customer_id,
SUM(CASE WHEN type = 2 THEN -amount ELSE amount END) AS value
FROM table_wallet
GROUP BY customer_id
) w ON c.customer_id = w.customer_id
LEFT JOIN (
SELECT customer_id,
MAX(order_id) AS order_id,
DATE(MAX(order_datetime)) AS order_date
FROM table_orders
GROUP BY customer_id
) o ON c.customer_id = o.customer_id
Your mistake was this: you joined two tables each with multiple rows for each customer id. For example, a particular customer might have had two orders and three wallet rows. Then, the join results in six rows representing all the possible combinations of wallet and order rows. That's called combinatorial explosion.
The solution I outlined makes sure there's only one row (or maybe no rows) to join for each customer_id, and so eliminates the combinatorial explosion.
Pro tip: Using subqueries like this makes it easy to test your query: you can test each subquery separately.
Upvotes: 2
Reputation: 679
To further illustrate from the previous answers, if we simply remove your group by statement, you can easily see why you are double counting. The following code:
SELECT
C.*,
O.order_id, O.order_datetime,
W.amount, W.type
FROM
table_customers as C
LEFT JOIN
table_wallet as W
ON C.customer_id = W.customer_id
LEFT JOIN
table_orders AS O
ON W.customer_id = O.customer_id
Will yield the result:
customer_id name order_id order_datetime amount type
1 abc 1 April, 22 2016 23:53:09 20 1
1 abc 2 May, 22 2016 23:53:09 20 1
1 abc 3 June, 22 2016 23:53:09 20 1
2 def (null) (null) 100 1
2 def (null) (null) 12.55 1
3 pqrs (null) (null) (null) (null)
4 wxyz (null) (null) (null) (null)
Note the duplication of Customer ID 1 with amount 20.
Upvotes: 2
Reputation: 44881
The issue is that the join between orders and wallet will produce as many rows as there as orders for each wallet, when you really just want one row per wallet from the order table (since you only use the max values). In your test case you get 3 rows for customer 1 which makes the sum 60 (3*20).
One way to solve this is to change to this:
SELECT
C.customer_id,
C.name,
COALESCE( SUM(CASE WHEN type = 2 THEN -W.amount ELSE W.amount END), 0) AS value,
COALESCE( O.order_id , '0' ) AS last_order_id,
COALESCE( DATE( O.order_datetime ) , '0000-00-00' ) AS last_order_date
FROM table_customers AS C
LEFT JOIN table_wallet AS W ON C.customer_id = W.customer_id
LEFT JOIN (
SELECT
customer_id,
MAX(order_id) AS order_id,
MAX(order_datetime) AS order_datetime
FROM table_orders
GROUP BY customer_id
) AS O ON c.customer_id = O.customer_id
GROUP BY C.customer_id
ORDER BY C.customer_id
As you see the orders table is replaced by a derived table that gets you one row per customer.
Running the query above gets you the following result:
| customer_id | name | value | last_order_id | last_order_date |
|-------------|------|--------|---------------|-----------------|
| 1 | abc | 20 | 3 | 2016-06-22 |
| 2 | def | 112.55 | 0 | 0000-00-00 |
| 3 | pqrs | 0 | 4 | 2016-06-15 |
| 4 | wxyz | 0 | 0 | 0000-00-00 |
Upvotes: 2