Reputation: 4850
I've trying to optimize a query on MySQL that takes around 15-20 seconds to run. My Data table has about 10M rows, and the query is trying to return 68,000 records which match 144 "run" fields and 35 "name" fields. Because the query is using two in clauses, my indexes don't seem to be terribly helpful.
Here's the query:
select * from data d where
d.data_type='Result' and
(d.run in ('8a7aee1f2a6232b1012a624da9201b92', '8a7aee1f2a6232b1012a625432a314ef' ,
... [144 runs]
)) and (d.name like 'itema[%]' or d.name like 'itemb[%]')
Here's the table definition
CREATE TABLE `data` (
`data_type` varchar(31) NOT NULL,
`id` char(32) NOT NULL,
`entry_time` datetime default NULL,
`name` varchar(255) NOT NULL,
`step` int(11) default NULL,
`value` double NOT NULL,
`run` char(32) NOT NULL,
PRIMARY KEY (`id`),
KEY `FK2EEFAA8ECCC6F3` (`run`),
KEY `data2` (`run`,`step`),
KEY `data3` (`data_type`,`name(10)`,`run`),
CONSTRAINT `FK2EEFAA8ECCC6F3` FOREIGN KEY (`run`) REFERENCES `run_archive` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Explain tells me the query is using key data3.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE d range FK2EEFAA8ECCC6F3,data2,data3 data3 223 NULL 113271 Using where
I used to run 144 queries (one for each run). It seems about twice as fast to do one query, but still way too slow.
Suggestions for optimizing? Ideas I have are:
Finding a magic index that speeds
this up
Denormalizing data (it'd be
easy to get rid of the run, but
harder for the name)
Splitting up the data among different tables (hard to do with my Java/Hibernate approach)
Or am I just asking the impossible here?
Edit: it turned out the biggest fix was to increase the size of my innodb_buffer_pool. The query went down to about 1 and a half seconds after doing this. I've marked as "answer" a fix that improved it slightly more.
Upvotes: 2
Views: 169
Reputation: 70564
Depending on how selective the condition on run
is, it might be better to provide the index
data_type, run, name(10)
The trouble with providing the column used for prefix matching early in the index is that it scatters matching rows across the index, requiring a bigger part of the index to be read from disk.
Also, using a smaller datatype for the id of run will reduce index size and speed up comparisions. This is a constant factor improvement, but might be worthwhile regardless.
Upvotes: 0
Reputation: 100567
Consider splitting result
records away from the data
table? I didn't catch what percentage your result
is, but perhaps it's worth benchmarking in a Dev copy of your Prod database.
Can you FK those run
values? If they're reusable(?), perhaps create a Run
table? My guesstimate is that 144 string matches, even indexed, is slower that if they were int
or smallint
. Again, benchmarking this suggestion, or any suggestion, will obviously prove the theory.
What does the query plan difference look like when not including your like clause on the name
attribute?
Upvotes: 1