Reputation: 4210
I am doing a simple Select from a single table.
CREATE TABLE `book` (
`Book_Id` int(10) NOT NULL AUTO_INCREMENT,
`Book_Name` varchar(100) COLLATE utf8_turkish_ci DEFAULT NULL ,
`Book_Active` bit(1) NOT NULL DEFAULT b'1' ,
`Author_Id` int(11) NOT NULL,
PRIMARY KEY (`Book_Id`),
KEY `FK_Author` (`Author_Id`),
CONSTRAINT `FK_Author` FOREIGN KEY (`Author_Id`) REFERENCES `author` (`Author_Id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=5947698 DEFAULT CHARSET=utf8 COLLATE=utf8_turkish_ci ROW_FORMAT=COMPACT
table : book
columns :
Book_Id (INTEGER 10) | Book_Name (VARCHAR 100) | Author_Id (INTEGER 10) | Book_Active (Boolean)
I have Indexes on three columns : Book_Id (PRIMARY key) , Author_Id (FK) , Book_Active .
first query :
SELECT * FROM book WHERE Author_Id = 1 AND Book_Active = 1
EXPLAIN :
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE book ref FK_Author,index_Book_Active FK_Author 4 const 4488510 Using where
second query :
SELECT b.* FROM book b
WHERE Book_Active=1
AND Book_Id IN (SELECT Book_Id FROM book WHERE Author_Id=1)
EXPLAIN :
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY book ref index_Book_Active index_Book_Active 1 const 9369399 Using where
2 DEPENDENT SUBQUERY book unique_subquery PRIMARY,FK_Author PRIMARY 4 func 1 Using where
The data statistics is like this :
16.8 million books
10.5 million Book_Active=true
6.3 million Book_Active = false
And For Author_Id=1
2.4 million Book_Active=false
5000 Book_Active=true
The first query takes 6.7 seconds . The second query takes 0.0002 seconds
What is the cause of this enormous difference ? is it the right thing to use the nested select query ?
edit: added "sql explain"
Upvotes: 3
Views: 126
Reputation: 991
In the first case: MySQL uses FK_Author
index (this gives us ~4.5M rows), then it has to match every row against Book_Active = 1
condition — index cannot be used here.
The second case: InnoDB implicitly adds primary key to every index. Thus, when MySQL executes this part: SELECT book.* FROM book WHERE Book_Active=1
it has Book_Id
from the index. Then for the subquery it has to match Book_Id
with Author_Id
; Author_Id
is a constant and is a prefix of the index; implicitly included primary key can be matched against implicit primary key from Book_Active
index. In your case it is faster to intersect two indices than to use index to retrieve 4.5M rows and scan them sequentially.
Upvotes: 2