Reputation: 55
I've got a slow performing query. I know using a dependent subquery is bad, but I can't think of another way to get the data I want.
Essentially, I want to flag customers who have at least 50 invoices in the past 6 months, but no invoices this month.
This is what I have currently:
select
Customer.name,
Customer.id,
Customer.latitude,
Customer.longitude
from
Customer
where
EXISTS (
SELECT
*
FROM
Invoice_Header
WHERE
Invoice_Header.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
AND
Invoice_Header.account_number = Customer.account_number
HAVING COUNT(invoice_num) > 50
)
AND NOT EXISTS (
SELECT *
FROM
Invoice_Header
WHERE
InvHead.inv_date > '2011-10-02'
AND
InvHead.account_number = Customer.account_number
)
Group by name;
Customer table has about 12k record, Invoice_Header has about 2mill records.
I have indexes on inv_date, account_number (in both tables).
Any suggestions for how to speed this up would be appreciated.
Upvotes: 1
Views: 90
Reputation: 6032
select
C.name,
C.id,
C.latitude,
C.longitude,
I.account_number,
count( IF(I.inv_date>='2011-03-02' AND I.inv_date <='2011-10-02',I.inv_date,NULL )) as inv_count_6,
count( IF(I.inv_date > '2011-10-02',I.inv_date,NULL )) as inv_count_1
from Customer C
LEFT JOIN Invoice_Header I
ON C.account_number = I.account_number
GROUP BY C.id, I.account_number
HAVING inv_count_6 >= 50 AND inv_count_1=0
WHERE I.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
Notes:
1.The invoices is AT LEAST 50. so the condition is >=50 not >50.
2.You have to add index to the column inv_date
Upvotes: 2
Reputation: 5389
I would suggest:
SELECT
c.name,
c.id,
c.latitude,
c.longitude
FROM
Customer AS c
INNER JOIN (
SELECT account_number, count(*) AS invoice_count
FROM Invoice_Header
WHERE inv_date >= '2011-03-02' AND inv_date <= '2011-10-02'
GROUP BY account_number
) AS lsm
ON c.account_number = lsm.account_number
LEFT JOIN (
SELECT account_number, count(*) AS invoice_count
FROM Invoice_Header
WHERE inv_date > '2011-10-02'
GROUP BY account_number
) AS lm
ON c.account_number = lm.account_number
WHERE
lsm.invoice_count >= 50
AND IFNULL(lm.invoice_count, 0) = 0
Upvotes: 2
Reputation: 60498
This should eliminate the correlated subqueries and be significantly faster:
SELECT c.name, c.id, c.latitude, c.longitude
FROM Customer c
INNER JOIN (
SELECT account_number
FROM Invoice_Header ih
WHERE ih.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
GROUP BY account_number
HAVING COUNT(*) > 50
MINUS
SELECT DISTINCT account_number
FROM Invoice_Header ih
WHERE ih.inv_date > '2011-10-02'
) tbl
ON tbl.account_number = c.account_number
Upvotes: 3