Reputation: 11
when I select query in only one table, we got response with using index, But when using join statemnt In query, mysql Full scan on table?
First query
select *
from t_deposit_trx
where remit_tx_id = '3a33ff14-8d31-45d0-b64f-8a251c4b19a5'
1 SIMPLE t_deposit_trx ref t_deposit_trx_remit_tx_id_index
t_deposit_trx_remit_tx_id_index 110 const 1 Using index condition
Second Query
select tx_id
from t_settle_trx
where report_date >= '2019-03-01'
and report_date <= '2019-03-16'
and tx_type = 'CANCEL'
1 SIMPLE t_settle_trx range t_settle_trx_report_date_tx_type_index t_settle_trx_report_date_tx_type_index 196 5263 Using index condition
the 2 query is well operated. using index and speed is good.
But join the two table, it's very slow.
select * from t_deposit_trx
force index (t_deposit_trx_remit_tx_id_index)
where remit_tx_id in (
select tx_id
from t_settle_trx
where report_date >= '2019-03-01'
and report_date <= '2019-03-02'
and tx_type = 'CANCEL'
)
1 PRIMARY t_deposit_trx ALL 55724
1 PRIMARY t_settle_trx range t_settle_trx_report_date_tx_type_index t_settle_trx_report_date_tx_type_index 196 299 Using index condition; Using where; FirstMatch(t_deposit_trx); Using join buffer (flat, BNL join)
we can see above result.. t_settle_trx use range scan and get tx_id and next I hope the query use index "t_settle_trx_report_date_tx_type_index" But it use Full scan..
I dont' know why?
Upvotes: 0
Views: 1396
Reputation: 142518
Query 2 is not optimal. Turn the index around:
INDEX(tx_type, report_date)
That is, put the column(s) tested with =
first, regardless of cardinality.
IN ( SELECT ... )
is not a JOIN
. (In newer versions, it might be converted into a JOIN
.)
Give this a try:
SELECT d.*
FROM t_settle_trx AS s
JOIN t_deposit_trx AS d ON s.tx_id = d.remit_tx_id
WHERE s.tx_type = 'CANCEL'
AND s.report_date >= '2019-03-01'
AND s.report_date < '2019-03-01' + INTERVAL 2 DAY
s: INDEX(tx_type, report_date, tx_id)
d: INDEX(remit_tx_id)
Think of this SELECT
with a JOIN
as starting with the table that has WHERE
clauses.
(Note: I arranged the tables and the order of where-clauses so you can see how the Optimizer is thinking. The order of tables and the order of where-clauses is not relevant; the Optimizer will rearrange as it sees fit.)
My formulation should
Full table scan is not neccessarily a bad thing. If most of a table will be touched anyway, it is actually faster to simply scan the table and not bounce between the index's BTree and the data's BTree. (Your specific case had other deficiencies; I focused on them.)
INDEX(tx_type, report_date, tx_id)
is "covering" in that all the columns needed by the query are found in the one index. "Covering" gives you a slight extra performance boost. It is indicated in EXPLAIN
by Using index
(not Using index condition
).
With tx_seq, this may be optimal:
INDEX(tx_type, report_date, tx_seq)
Upvotes: 1
Reputation: 11
Guys.. Thank you about your interest about my question.
Above 2 tables has same index structure in staging environment and production environment. and using galera cluster.
What I am actually wonder is why they use different "execution plan"? Especially in production environment, JOIN query is very slow.
What I do yesterday is..
select * from t_deposit_trx
where remit_tx_id in (
select tx_id
from t_settle_trx
where report_date >= '2019-03-01'
and report_date <= '2019-03-02'
and tx_type = 'CANCEL'
)
I change this query slightly like
select * from t_deposit_trx
where tx_seq in (
select tx_seq
from t_settle_trx
where report_date >= '2019-03-01'
and report_date <= '2019-03-02'
and tx_type = 'CANCEL'
)
and then first t_settle_trx "range scan" and the second t_deposit_trx "Index scan".. but the tow tx_seq value is different value. It didn't have any relation ship. It's only test if they can use index when join query.
You know what I mean? that means they can use index when joining.
So Is there any environment variable problem or Key Length limit or anythinig?
Thank you for reading me.
Upvotes: 0
Reputation: 7124
I hope that this will not be slower than your first query:
SELECT t1.* FROM t_deposit_trx t1
INNER JOIN t_settle_trx t2
ON t1.remit_tx_id=t2.tx_id
WHERE t2.report_date >= '2019-03-01'
AND t2.report_date <= '2019-03-02'
AND t2.tx_type='CANCEL';
Upvotes: 0