Manu
Manu

Reputation: 4123

Mysql optimization help needed

I have a table TABLE_A with 4397898 records

+-------------------+---------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------------+---------------+------+-----+---------+----------------+ | id | bigint(11) | NO | PRI | NULL | auto_increment | | usrid | int(11) | YES | | NULL | | | grpid | int(11) | YES | | NULL | | | catid | int(11) | YES | MUL | NULL | | | folderid | int(11) | NO | MUL | 5 | | | popid | int(11) | YES | MUL | NULL | | | accid | int(11) | YES | MUL | NULL | | | contentid | bigint(11) | YES | MUL | NULL | | | priority | smallint(6) | YES | | NULL | | | rating | smallint(6) | NO | | 3 | | | fromid | int(11) | YES | | NULL | | | ctxid | varchar(255) | YES | | NULL | | | ctxmsgid | varchar(255) | YES | | NULL | | | starred | enum('Y','N') | YES | | N | | | links | enum('y','n') | YES | | n | | +-------------------+---------------+------+-----+---------+----------------+

and it is indexed as below

+---------+----+--------------+-----+-------------+-----+-------------+----------+--------+------+------------+ | Table | Nu | Key_name | Seq | Column_name | Col | Cardinality | Sub_part | Packed | Null | Index_type | +---------+----+--------------+-----+-------------+-----+-------------+----------+--------+------+------------+ | TABLE_A | 0 | PRIMARY | 1 | id | A | 4617132 | NULL | NULL | | BTREE | | TABLE_A | 1 | catIndx | 1 | catid | A | 256507 | NULL | NULL | YES | BTREE | | TABLE_A | 1 | contentIndx | 1 | contentid | A | 4617132 | NULL | NULL | YES | BTREE | | TABLE_A | 1 | catIndx_2 | 1 | catid | A | 18 | NULL | NULL | YES | BTREE | | TABLE_A | 1 | catIndx_2 | 2 | popid | A | 2013 | NULL | NULL | YES | BTREE | | TABLE_A | 1 | folderidIndx | 1 | folderid | A | 13619 | NULL | NULL | | BTREE | | TABLE_A | 1 | accIndex | 1 | accid | A | 1532 | NULL | NULL | YES | BTREE | | TABLE_A | 1 | popindx | 1 | popid | A | 1532 | NULL | NULL | YES | BTREE | +---------+----+--------------+-----+-------------+-----+-------------+----------+--------+------+------------+

The following query

explain SELECT intCommIndx FROM TABLE_A WHERE ( (popid IN('-1',2407 ,22 ,1203 ,1342 ,1207 ,3 ,1254 ,2663 ,1250 ,3461 ,1251 ,14 ,1174 ,120 ,2406 ,2402 ,325 ,925 ,1210 ,2280 ,1 ,1202 ,1560 ,775 ,776 ,789 ,777 ,778 ,12 ,779 ,780 ,781 ,782 ,783 ,784 ,785 ,786 ,787 ,788 ,1209 ,19 ,26 ,9 ,24 ,4 ,25 ,21 ,18 ,1309 ,967 ,1212 ,6 ,9633 ,5 ,2671 ,17 ,13 ,1211 ,749 ,752 ,747 ,750 ,748 ,9302 ,1470 ,190 ,188 ,9711 ,9710 ,9512 ,11512 ,9514 ,9515 ,9516 ,11511 ,9513 ,9316 ,9453 ,1641 ,4986 ,1639 ,1640 ,7814 ,10042 ,9452 ,11236 ,11241 ,11238 ,11239 ,11237 ,11242 ,11240 ,1711 ) OR intpop3indx = -1) AND catid = 5 )

explain SELECT id FROM TABLE_A WHERE ( (popid IN('-1',2407 ,22 ,1203 ,1342 ,1207 ,3 ,1254 ,2663 ,1250 ,3461 ,1251 ,14 ,1174 ,120 ,2406 ,2402 ,325 ,925 ,1210 ,2280 ,1 ,1202 ,1560 ,775 ,776 ,789 ,777 ,778 ,12 ,779 ,780 ,781 ,782 ,783 ,784 ,785 ,786 ,787 ,788 ,1209 ,19 ,26 ,9 ,24 ,4 ,25 ,21 ,18 ,1309 ,967 ,1212 ,6 ,9633 ,5 ,2671 ,17 ,13 ,1211 ,749 ,752 ,747 ,750 ,748 ,9302 ,1470 ,190 ,188 ,9711 ,9710 ,9512 ,11512 ,9514 ,9515 ,9516 ,11511 ,9513 ,9316 ,9453 ,1641 ,4986 ,1639 ,1640 ,7814 ,10042 ,9452 ,11236 ,11241 ,11238 ,11239 ,11237 ,11242 ,11240 ,1711 ) OR popid = -1) AND catid = 5 ) gives

+----+-------------+---------+------+---------------------------+---------+---------+-------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+---------------------------+---------+---------+-------+--------+-------------+ | 1 | SIMPLE | TABLE_A | ref | catIndx,catIndx_2,popindx | catIndx | 5 | const | 649800 | Using where | +----+-------------+---------+------+---------------------------+---------+---------+-------+--------+-------------+

How to improve the speed of the query?

There are only 850 rows in the result. mysql> SELECT -> count(id) -> FROM TABLE_A -> WHERE ( -> (popid IN('-1',2407,22,1203,1342,1207,3,1254,2663,1250,3461,1251,14,1174,120,2406,2402,325,925,1210,2280,1,1202,1560,775,776,789,777,778,12,779,780,781,782,783,784,785,786,787,788,1209,19,26,9,24,4,25,21,18,1309,967,1212,6,9633,5,2671,17,13,1211,749,752,747,750,748,9302,1470,190,188,9711,9710,9512,11512,9514,9515,9516,11511,9513,9316,9453,1641,4986,1639,1640,7814,10042,9452,11236,11241,11238,11239,11237,11242,11240,1711) OR intpop3indx = -1) -> AND catid = 5 ); +--------------------+ | count(id) | +--------------------+ | 850 | +--------------------+ 1 row in set (11.22 sec)

What changes can I make to get these 850 records within milliseconds?

Upvotes: 1

Views: 40

Answers (3)

spencer7593
spencer7593

Reputation: 108510

Your query is selecting column intCommIndx from the table, but that column does not appear as one of the table columns.

The quick answer is to create a "covering index" e.g.

... ON TABLE_A (catid, intpop3indx, popid, intCommIndx)

EXPLAIN output should show "Using index" in the Extra column, which means the query is satisfied entirely from the index without a need to look up pages from the table.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271151

This is the query:

SELECT intCommIndx 
FROM TABLE_A
WHERE ( (popid IN('-1',2407 ,22 ,1203   ,1342   ,1207   ,3  ,1254   ,2663   ,1250   ,3461   ,1251   ,14 ,1174   ,120    ,2406   ,2402   ,325    ,925    ,1210   ,2280   ,1  ,1202   ,1560   ,775    ,776    ,789    ,777    ,778    ,12 ,779    ,780    ,781    ,782    ,783    ,784    ,785    ,786    ,787    ,788    ,1209   ,19 ,26 ,9  ,24 ,4  ,25 ,21 ,18 ,1309   ,967    ,1212   ,6  ,9633   ,5  ,2671   ,17 ,13 ,1211   ,749    ,752    ,747    ,750    ,748    ,9302   ,1470   ,190    ,188    ,9711   ,9710   ,9512   ,11512  ,9514   ,9515   ,9516   ,11511  ,9513   ,9316   ,9453   ,1641   ,4986   ,1639   ,1640   ,7814   ,10042  ,9452   ,11236  ,11241  ,11238  ,11239  ,11237  ,11242  ,11240  ,1711   ) OR intpop3indx = -1
        ) AND
        catid = 5
       )

Queries with or can be quite hard to optimize. I would recommend creating two indexes on the table and then rewriting the query. The two indexes are intCommIndx(catid, popid, intCommIndx) and intCommIndx(catid, intpop3indx, intCommIndx). Then the new query is:

SELECT intCommIndx 
FROM TABLE_A
WHERE catid = 5 and
      popid IN ('-1',2407 ,22 ,1203   ,1342   ,1207   ,3  ,1254   ,2663   ,1250   ,3461   ,1251   ,14 ,1174   ,120    ,2406   ,2402   ,325    ,925    ,1210   ,2280   ,1  ,1202   ,1560   ,775    ,776    ,789    ,777    ,778    ,12 ,779    ,780    ,781    ,782    ,783    ,784    ,785    ,786    ,787    ,788    ,1209   ,19 ,26 ,9  ,24 ,4  ,25 ,21 ,18 ,1309   ,967    ,1212   ,6  ,9633   ,5  ,2671   ,17 ,13 ,1211   ,749    ,752    ,747    ,750    ,748    ,9302   ,1470   ,190    ,188    ,9711   ,9710   ,9512   ,11512  ,9514   ,9515   ,9516   ,11511  ,9513   ,9316   ,9453   ,1641   ,4986   ,1639   ,1640   ,7814   ,10042  ,9452   ,11236  ,11241  ,11238  ,11239  ,11237  ,11242  ,11240  ,1711   )
UNION
SELECT intCommIndx 
FROM TABLE_A
WHERE catid = 5 and intpop3indx = -1;

This will allow each subquery to be satisfied only using the indexes.

Upvotes: 1

Kabulan0lak
Kabulan0lak

Reputation: 2136

I would try to put the last conditions in first position. So when one of them isn't verified, we pass directly to the other row :

catid = 5 AND (intpop3indx = -1 OR popid IN (...))

So when catid != 5, the query passes directly to the other row. (Even if catid = 5 AND intpop3indx = -1)

Upvotes: 0

Related Questions