Tom
Tom

Reputation: 55

MySQL Query optimization required

I've got a slow performing query. I know using a dependent subquery is bad, but I can't think of another way to get the data I want.

Essentially, I want to flag customers who have at least 50 invoices in the past 6 months, but no invoices this month.

This is what I have currently:

select
    Customer.name,
    Customer.id,
    Customer.latitude,
    Customer.longitude
from
    Customer
where
    EXISTS (
        SELECT 
            *
        FROM
            Invoice_Header
        WHERE
            Invoice_Header.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
        AND
            Invoice_Header.account_number = Customer.account_number
        HAVING COUNT(invoice_num) > 50
    )
    AND NOT EXISTS (
        SELECT * 
        FROM 
            Invoice_Header 
        WHERE
            InvHead.inv_date > '2011-10-02'
        AND
            InvHead.account_number = Customer.account_number
    )
Group by name;

Customer table has about 12k record, Invoice_Header has about 2mill records.

I have indexes on inv_date, account_number (in both tables).

Any suggestions for how to speed this up would be appreciated.

Upvotes: 1

Views: 90

Answers (4)

Imdad
Imdad

Reputation: 6032

select
    C.name,
    C.id,
    C.latitude,
    C.longitude,
    I.account_number,         
    count( IF(I.inv_date>='2011-03-02' AND I.inv_date <='2011-10-02',I.inv_date,NULL )) as inv_count_6,
    count( IF(I.inv_date > '2011-10-02',I.inv_date,NULL )) as inv_count_1

from Customer C
LEFT JOIN Invoice_Header I
ON C.account_number = I.account_number 

GROUP BY C.id, I.account_number
HAVING inv_count_6 >= 50 AND inv_count_1=0
WHERE I.inv_date  BETWEEN '2011-03-02' AND '2011-10-02'

Notes:

1.The invoices is AT LEAST 50. so the condition is >=50 not >50.

2.You have to add index to the column inv_date

Upvotes: 2

Xint0
Xint0

Reputation: 5389

I would suggest:

SELECT
    c.name,
    c.id,
    c.latitude,
    c.longitude
FROM
    Customer AS c
    INNER JOIN (
        SELECT account_number, count(*) AS invoice_count
        FROM Invoice_Header
        WHERE inv_date >= '2011-03-02' AND inv_date <= '2011-10-02'
        GROUP BY account_number
    ) AS lsm
        ON c.account_number = lsm.account_number
    LEFT JOIN (
        SELECT account_number, count(*) AS invoice_count
        FROM Invoice_Header
        WHERE inv_date > '2011-10-02'
        GROUP BY account_number
    ) AS lm
        ON c.account_number = lm.account_number
    WHERE
        lsm.invoice_count >= 50
        AND IFNULL(lm.invoice_count, 0) = 0

Upvotes: 2

Eric Petroelje
Eric Petroelje

Reputation: 60498

This should eliminate the correlated subqueries and be significantly faster:

SELECT c.name, c.id, c.latitude, c.longitude
FROM Customer c
INNER JOIN (
  SELECT account_number
  FROM Invoice_Header ih
  WHERE ih.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
  GROUP BY account_number
  HAVING COUNT(*) > 50
  MINUS
  SELECT DISTINCT account_number
  FROM Invoice_Header ih
  WHERE ih.inv_date > '2011-10-02'
) tbl
ON tbl.account_number = c.account_number

Upvotes: 3

aleroot
aleroot

Reputation: 72636

Try run your query with explain and see if other indexs are needed .

Upvotes: 1

Related Questions