kommradHomer
kommradHomer

Reputation: 4210

simple mysql query working slower than nested Select

I am doing a simple Select from a single table.

CREATE TABLE `book` (
 `Book_Id` int(10) NOT NULL AUTO_INCREMENT,
 `Book_Name` varchar(100) COLLATE utf8_turkish_ci DEFAULT NULL ,
 `Book_Active` bit(1) NOT NULL DEFAULT b'1' ,
 `Author_Id` int(11) NOT NULL,
 PRIMARY KEY (`Book_Id`),
 KEY `FK_Author` (`Author_Id`),
 CONSTRAINT `FK_Author` FOREIGN KEY (`Author_Id`) REFERENCES `author` (`Author_Id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=5947698 DEFAULT CHARSET=utf8 COLLATE=utf8_turkish_ci ROW_FORMAT=COMPACT

table : book

columns :
Book_Id (INTEGER 10) | Book_Name (VARCHAR 100) | Author_Id (INTEGER 10) | Book_Active (Boolean)

I have Indexes on three columns : Book_Id (PRIMARY key) , Author_Id (FK) , Book_Active .

first query :

SELECT * FROM book WHERE Author_Id = 1 AND Book_Active = 1

EXPLAIN : 
id  select_type table   type    possible_keys                   key         key_len ref     rows    Extra
1   SIMPLE      book    ref     FK_Author,index_Book_Active     FK_Author   4       const   4488510 Using where

second query :

SELECT b.* FROM book b 
WHERE Book_Active=1 
AND Book_Id IN (SELECT Book_Id FROM book WHERE Author_Id=1)
EXPLAIN :
id  select_type         table   type            possible_keys       key                 key_len ref     rows    Extra
1   PRIMARY             book    ref             index_Book_Active   index_Book_Active   1       const   9369399 Using where
2   DEPENDENT SUBQUERY  book    unique_subquery PRIMARY,FK_Author   PRIMARY             4       func    1       Using where

The data statistics is like this :

16.8 million books 
10.5 million Book_Active=true
6.3 million Book_Active = false

And For Author_Id=1

2.4 million Book_Active=false
5000 Book_Active=true 

The first query takes 6.7 seconds . The second query takes 0.0002 seconds

What is the cause of this enormous difference ? is it the right thing to use the nested select query ?

edit: added "sql explain"

Upvotes: 3

Views: 126

Answers (1)

Vladimir Kolesnikov
Vladimir Kolesnikov

Reputation: 991

In the first case: MySQL uses FK_Author index (this gives us ~4.5M rows), then it has to match every row against Book_Active = 1 condition — index cannot be used here.

The second case: InnoDB implicitly adds primary key to every index. Thus, when MySQL executes this part: SELECT book.* FROM book WHERE Book_Active=1 it has Book_Id from the index. Then for the subquery it has to match Book_Id with Author_Id; Author_Id is a constant and is a prefix of the index; implicitly included primary key can be matched against implicit primary key from Book_Active index. In your case it is faster to intersect two indices than to use index to retrieve 4.5M rows and scan them sequentially.

Upvotes: 2

Related Questions