Ian

Reputation: 354

About the SQL performance of SELECT ... IN

MySQL 5.7.21

I use a connection pool to connect to the database and run the SQL:

let mysql = require('mysql');
let pool = mysql.createPool(db);

pool.getConnection((err, conn) => {
  if (err) {
    // ...
  } else {
    console.log('allConnections:' + pool._allConnections.length);
    let q = conn.query(sql, val, (err, rows, fields) => {
      // ...
    });
  }
});

I have a table with around 1,000,000 records. I wrote a SELECT to fetch the matching records.

select * from tableA where trackingNo in (?)

I send the trackingNo values as an array parameter. The array holds around 20,000 tracking numbers.
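
For context, the call looks roughly like this. It is only a sketch: trackingNos and the callback body are placeholder names, and it relies on the mysql package's documented behavior of expanding an array bound to a ? placeholder into a quoted, comma-separated list.

// Sketch only: trackingNos stands in for the ~20,000-element array.
// The mysql package turns an array bound to "?" into 'v1', 'v2', ...,
// so the statement becomes "... where trackingNo in ('v1', 'v2', ...)".
let sql = 'select * from tableA where trackingNo in (?)';
let trackingNos = ['TN0001', 'TN0002' /* , ... ~20,000 values */];

pool.query(sql, [trackingNos], (err, rows) => {
  if (err) throw err;
  console.log('matched rows: ' + rows.length);
});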

I created an index on the trackingNo column. (The column is a varchar, not unique, and can contain null, blank, and any other value.)

The problem is that it takes around 5 minutes to get the results. Those 5 minutes are purely backend SQL execution time, which seems too slow for matching 20,000 values against 1,000,000 records. Do you have any suggestions for SELECT ... IN?

Explain SQL:

id  select_type  table   partitions  type  possible_keys          key   key_len  ref   rows    filtered  Extra
1   SIMPLE       tableA  null        ALL   table_tracking_no_idx  null  null     null  999507  50        Using where

Upvotes: 0

Views: 65

Answers (2)

Gordon Linoff

Reputation: 1269773

MySQL creates a binary search tree for IN lists that are composed of constants. As explained in the documentation:

If all values are constants, they are evaluated according to the type of expr and sorted. The search for the item then is done using a binary search. This means IN is very quick if the IN value list consists entirely of constants.

In general, creating a separate table with constants does not provide much improvement in performance.

I suppose there could be some subtle issue with type compatibility -- such as collations -- that interferes with this process.
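One way to rule that out is to check the column's declared type, character set, and collation directly (a sketch using the table and column names from the question):

-- Show the declared type, collation, and nullability of trackingNo
SHOW FULL COLUMNS FROM tableA LIKE 'trackingNo';

-- Or read the same information from information_schema
SELECT DATA_TYPE, CHARACTER_SET_NAME, COLLATION_NAME, IS_NULLABLE
FROM information_schema.COLUMNS
WHERE TABLE_NAME = 'tableA' AND COLUMN_NAME = 'trackingNo';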

This type of query probably requires a full table scan. If the rows are wide, then the combination of the scan and returning the data may be accounting for the performance. I do agree that five minutes is a long time, but it could be entirely due to the network connection between the app/GUI and the database.

Upvotes: 0

Tim Biegeleisen

Reputation: 521249

You could consider populating a table with the tracking numbers you want to match. Then, you could use an inner join instead of your current WHERE IN approach:

SELECT *
FROM tableA a
INNER JOIN tbl b
    ON a.trackingNo = b.trackingNo;

This has the advantage that you may index the new tbl table on the trackingNo column to make the join lookup extremely fast.

This assumes that tbl would have a single column trackingNo which contains the 20K+ values you need to consider.
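
A sketch of that setup, assuming a temporary staging table; the VARCHAR(50) length here is a placeholder and should match the definition and collation of tableA.trackingNo:

-- Hypothetical staging table for the ~20,000 tracking numbers
CREATE TEMPORARY TABLE tbl (
    trackingNo VARCHAR(50),               -- placeholder length; match tableA.trackingNo
    INDEX idx_trackingNo (trackingNo)
);

-- Load the values (batched from the application in practice)
INSERT INTO tbl (trackingNo) VALUES ('TN0001'), ('TN0002');  -- ... remaining values

SELECT a.*
FROM tableA a
INNER JOIN tbl b
    ON a.trackingNo = b.trackingNo;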

Upvotes: 2
