Reputation: 690
I have a table like this:
create table test (
id int primary key auto_increment,
idcard varchar(30),
name varchar(30),
custom_value varchar(50),
index i1(idcard)
)
I insert 30,000,000 rows to the table
and then I execute:
select * from test where idcard='?'
The statement cost 12 seconds to return
when I use iostat to monitor disk
the read speed is about 6 mb/s while the util is 94%
is any way to optimize it?
Upvotes: 2
Views: 2506
Reputation: 142298
12 seconds may be realistic.
Assumptions about the question:
CHARACTER SET
and COLLATION
for idcard
.INDEX(idcard)
exists and is used in the query.Here's a breakdown of the processing:
?
, scan forward until hitting an entry that is not ?
(about 3K rows later).SELECT *
.Step 1: Fast.
Step 2: This is (based on the assumption of not being cached) costly. It may involve about 3K disk hits. For an HDD, that would be about 30 seconds. So, 12 seconds could imply some of the stuff was cached or happened to be near each other.
Step 3: This is a network cost, which I am not considering.
Run the query a second time. It may take only 1 second the this time -- because all 3K blocks are cached in RAM! And iostat will show zero activity!
is any way to optimize it?
Well...
innodb_buffer_pool_size
should be about 70% of available RAM, but not so big that it leads to swapping. What is its setting, and how much RAM do you have and what else is running on the machine?SHOW TABLE STATUS
.) If you can't make the buffer_pool big enough for it, and you have a variety of queries that compete for different parts of this (and other) tables, then more RAM may be beneficial.Upvotes: 2
Reputation: 750
Seems more like an I/O limitation than something that could be solved by adding indices. What will improve the speed is change the collation of the idcard column to latin1_bin. This uses only 1 byte per character. It also uses binary comparison which is faster than case insensitive comparison.
Only do this if you have no special characters in the idcard column, because the character set of latin1 is quite limited.
ALTER TABLE `test` CHANGE COLUMN `idcard` `idcard` VARCHAR(30) COLLATE 'latin1_bin' AFTER `id`;
Furthermore the ROW_FORMAT=FIXED also improves the speed. ROW_FORMAT=FIXED is not available using the InnoDB engine, but it is with MyISAM. The resulting table I now have is shown below. It's 5 times quicker (80% less time) with select statements than the initial table.
Note that I also changed the collation for 'name' and 'custom_value' to latin1_bin. This does make quite a difference in speed in my test setup, and I'm still figuring out why.
CREATE TABLE `test` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`idcard` VARCHAR(30) COLLATE 'latin1_bin',
`name` VARCHAR(30) COLLATE 'latin1_bin',
`custom_value` VARCHAR(50) COLLATE 'latin1_bin',
PRIMARY KEY (`id`),
INDEX `i1` (`idcard`)
)
ENGINE=MyISAM
ROW_FORMAT=FIXED ;
Upvotes: 0
Reputation: 521239
You may try adding the three other columns in the select clause to the index:
CREATE INDEX idx ON test (idcard, id, name, custom_value);
The three columns other than idcard
are being added to allow the index to cover everything being selected. The problem with your current index is that it is only on idcard
. This means that once MySQL has traversed down to each leaf node in the index, it would have to do another seek back to the clustered index to lookup the values of all columns mentioned in the select *
. As a result of this, MySQL may choose to ignore the index completely. The suggestion I made above avoids this additional seek.
Upvotes: -1