RNK
RNK

Reputation: 5792

mysql query performance improve

I have performance issue in below query:

SELECT t.local_branch_revenue, t.total_payment,

    (SELECT SUM(IF(cpo.real_account_type = 'HQ', 0, cpo.payment_amount)) AS cpo_payment_amount
        FROM customer_payment_options cpo
        WHERE tran_id=t.id
        AND cpo.payment_type != 'WALLET' AND cpo.payment_type != 'REWARD_CREDIT'
        GROUP BY cpo.tran_id)
    as cpo_payment_amount,

    b.ben_firstname, b.ben_lastname
    FROM transaction t
    LEFT JOIN beneficiary b
    ON b.id=t.ben_id
    WHERE t.local_branch_id='31'
    AND DATE(t.date_added) < '2016-04-07'
    AND source_country_id='40'
    AND t.transaction_status != 'CANCELLED'

EXPLAIN

 +----+--------------------+-------+--------+----------------------------------------+----------------------------------------+---------+-----------------+------+-------------+
| id | select_type        | table | type   | possible_keys                          | key                                    | key_len | ref             | rows | Extra       |
+----+--------------------+-------+--------+----------------------------------------+----------------------------------------+---------+-----------------+------+-------------+
| 1  | PRIMARY            | t     | ref    | local_branch_id,source_country_id      | local_branch_id                        | 5       | const           | 2    | Using where |
+----+--------------------+-------+--------+----------------------------------------+----------------------------------------+---------+-----------------+------+-------------+
| 1  | PRIMARY            | b     | eq_ref | PRIMARY                                | PRIMARY                                | 8       | mtesdb.t.ben_id | 1    |             |
+----+--------------------+-------+--------+----------------------------------------+----------------------------------------+---------+-----------------+------+-------------+
| 2  | DEPENDENT SUBQUERY | cpo   | ref    | tran_id_payment_type_real_account_type | tran_id_payment_type_real_account_type | 9       | mtesdb.t.id     | 1    | Using where |
+----+--------------------+-------+--------+----------------------------------------+----------------------------------------+---------+-----------------+------+-------------+

As you can see, it's using indexes from possible key. But still query takes about 13 sec.

I also have index over transaction table: (ben_id, company_id, source_country_id, date_added, tran_owner). But, it's not even coming in possible keys section.

Let me know if you need table schema.

What am I missing here?

Upvotes: 1

Views: 56

Answers (2)

Rick James
Rick James

Reputation: 142366

WHERE t.local_branch_id='31'
AND DATE(t.date_added) < '2016-04-07'
AND source_country_id='40'

Change that date test to simply t.date_added < '2016-04-07' ! Otherwise the following index suggestions won't work.

What table is source_country_id in?? If it is in t, then you need INDEX(local_branch_id, source_country_id, date_added). If it is not in t, then INDEX(local_branch_id, date_added).

Please provide SHOW CREATE TABLE if you need further discussion.

Upvotes: 0

O. Jones
O. Jones

Reputation: 108766

Dependent subqueries don't perform very well in MySQL ... the query planner doesn't transform them to JOINed subqueries efficiently. (They're OK in Oracle and SQL Server, but who has the money for those?) So, a good bet for you is to refactor your query to eliminate the dependent subquery.

Here's your subquery. Let's refactor it as an independent subquery. We'll get rid of the WHERE tran_id=t.id and move it, later, to an ON clause.

             SELECT tran_id,
                    SUM(IF(real_account_type = 'HQ',
                           0, 
                           payment_amount)) AS cpo_payment_amount
               FROM customer_payment_options
              WHERE payment_type != 'WALLET'
                AND payment_type != 'REWARD_CREDIT'
             GROUP BY tran_id

Notice you can simplify this as follows -- your IF() clause excludes rows with real_account_type = 'HQ'. You can do that in the WHERE clause instead.

             SELECT tran_id,
                    SUM(payment_amount) AS cpo_payment_amount
               FROM customer_payment_options
              WHERE payment_type != 'WALLET'
                AND payment_type != 'REWARD_CREDIT'
                AND real_account_type != 'HQ'
             GROUP BY tran_id

A compound index on (tran_id, payment_type, real_account_type, payment_amount) may help this subquery run faster. But the presence of those three != clauses guarantees a full index scan; there's no way to random access any index for those.

This generates a virtual table containing one row per tran_id with the sum you need.

Next we need to join that into your main query.

SELECT t.local_branch_revenue, 
       t.total_payment,
       IFNULL(cposum.cpo_payment_amount,0) cpo_payment_amount,
       b.ben_firstname, b.ben_lastname
  FROM transaction t
  LEFT JOIN beneficiary b ON b.id=t.ben_id
  LEFT JOIN (
             SELECT tran_id,
                    SUM(payment_amount) AS cpo_payment_amount
               FROM customer_payment_options
              WHERE payment_type != 'WALLET'
                AND payment_type != 'REWARD_CREDIT'
                AND real_account_type != 'HQ'
             GROUP BY tran_id
       ) cposum ON t.id = cposum.tran_id
 WHERE t.local_branch_id='31'
   AND DATE(t.date_added) < '2016-04-07'
   AND source_country_id='40'
   AND t.transaction_status != 'CANCELLED'

Do you see how we've changed the dependent summary subquery into its own virtual table? That lets the query planner run that query just once, rather than once for each row in the main query. That helps a lot.

The IFNULL() gets you a numeric value for cpo_payment_amount, rather than NULL, for transaction rows lacking any corresponding customer_payment_options rows.

A compound index on the transaction table on (local_branch_id, source_country_id, date_added) will help this query; the query engine can random access the local_branch_id and source_country_id values, then range scan the date_added value.

How do you learn to do this yourself? http://use-the-index-luke.com/ is a good start.

Upvotes: 2

Related Questions