Reputation: 16304
Table structure:
CREATE TABLE `mytable` (
`id` varchar(8) NOT NULL,
`event` varchar(32) NOT NULL,
`event_date` date NOT NULL,
`event_time` time NOT NULL,
KEY `id` (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
The data in this table looks like this:
id | event | event_date | event_time
---------+------------+-------------+-------------
ref1 | someevent1 | 2010-01-01 | 01:23:45
ref1 | someevent2 | 2010-01-01 | 02:34:54
ref1 | someevent3 | 2010-01-18 | 01:23:45
ref2 | someevent4 | 2012-10-05 | 22:23:21
ref2 | someevent5 | 2012-11-21 | 11:22:33
The table contains about 500,000,000 rows similar to these.
The query I'd like to ask about here looks like this:
SELECT *
FROM `mytable`
WHERE `id` = 'ref1'
ORDER BY event_date DESC,
event_time DESC
LIMIT 0, 500
The EXPLAIN output looks like:
select_type: SIMPLE
table: E
type: ref
possible_keys: id
key: id
key_len: 27
ref: const
rows: 17024 (a common example)
Extra: Using where; Using filesort
Purpose:
This query is generated by a website. The LIMIT values drive the page navigation element: if the user wants to see older entries, they are adjusted to 500, 500, then 1000, 500, and so on.
Since some values of the field id appear in quite a lot of rows, the query gets slower as the number of matching rows grows. Profiling those slow queries showed that sorting is the cause: the MySQL server spends almost all of the query time sorting the data. Indexing the fields event_date and event_time didn't change that very much.
Example SHOW PROFILE result, sorted by duration:
state | duration/sec | percentage
---------------|--------------|-----------
Sorting result | 12.00145 | 99.80640
Sending data | 0.01978 | 0.16449
statistics | 0.00289 | 0.02403
freeing items | 0.00028 | 0.00233
...
Total | 12.02473 | 100.00000
Now the question:
Before delving deeper into MySQL variables like sort_buffer_size and other server configuration options, can you think of any way to change the query or the sorting behaviour so that sorting is no longer such a performance bottleneck, while the query still serves its purpose?
I don't mind a bit of out-of-the-box-thinking.
Thank you in advance!
Upvotes: 1
Views: 179
Reputation: 2973
As I wrote in my comment, a multi-column index on (id, event_date desc, event_time desc) may help.
If this table will grow quickly, you should consider adding an option in the application that lets the user select data for a particular date range.
Example: the first step always returns 500 records, but to see further records the user has to set a date range first and then paginate within it.
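A minimal sketch of what such a date-restricted query might look like, reusing the query from the question. The date bounds are hypothetical placeholders that the application would fill in from the user's selection:

```sql
-- Sketch only: the two dates below are made-up example values,
-- standing in for whatever range the user picks in the interface.
SELECT *
FROM `mytable`
WHERE `id` = 'ref1'
  AND `event_date` BETWEEN '2010-01-01' AND '2010-01-31'
ORDER BY `event_date` DESC,
         `event_time` DESC
LIMIT 0, 500;
```

Restricting on event_date reduces the number of rows that match the WHERE clause, so there is less to sort even when the sort itself cannot be avoided.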
Upvotes: 2
Reputation: 29629
I would start by doing what sufleR suggests - the multi-column index on (id, event_date desc, event_time desc).
However, according to http://dev.mysql.com/doc/refman/5.0/en/create-index.html, the DESC keyword is parsed but ignored, so it doesn't actually do anything. That's a bit of a pain - so try it, and see if it improves the performance, but it probably won't.
If that's the case, you may have to cheat by creating a "sort_column" with an automatically decrementing value, and add that column to the index. (I'm pretty sure you'd have to do this in the application layer; I don't think you can auto-decrement in MySQL.)
You'd end up with:
id | event | event_date | event_time | sort_value
---------+------------+-------------+-------------+------------
ref1 | someevent1 | 2010-01-01 | 01:23:45 | 0
ref1 | someevent2 | 2010-01-01 | 02:34:54 | -1
ref1 | someevent3 | 2010-01-18 | 01:23:45 | -2
ref2 | someevent4 | 2012-10-05 | 22:23:21 | -3
ref2 | someevent5 | 2012-11-21 | 11:22:33 | -4
and an index on id and sort_value.
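As a rough sketch, the schema change and the adjusted query could look like the following. The column type, the default, and the index name `id_sort_value` are assumptions made for illustration; populating sort_value (for existing and new rows) is left to the application, as described above:

```sql
-- Sketch only: BIGINT, DEFAULT 0 and the index name are assumed choices.
-- Existing rows all start at 0 and would still need real values
-- assigned by the application before the ordering is meaningful.
ALTER TABLE `mytable`
  ADD COLUMN `sort_value` BIGINT NOT NULL DEFAULT 0,
  ADD INDEX `id_sort_value` (`id`, `sort_value`);

-- Newer events carry smaller sort_value, so ascending order
-- returns the newest rows first.
SELECT *
FROM `mytable`
WHERE `id` = 'ref1'
ORDER BY `sort_value` ASC
LIMIT 0, 500;
```

Note that altering a MyISAM table of this size rebuilds the whole table, so it is likely to take a long time and is best tried on a copy first.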
Dirty, but the only other suggestion is to reduce the number of records matching the where clause in other ways - for instance, by changing the interface not to return 500 records, but records for a given date.
Upvotes: 1
Reputation: 2381
Indexing is most likely the solution; you just have to do it right. See the MySQL reference page for this.
The most effective way to do it is to create a three-part index on (id, event_date, event_time). You can specify event_date desc, event_time desc in the index, but I don't think it's necessary.
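A sketch of how this could be tried out, assuming the index name `id_date_time` (a made-up name). MySQL can read an index in reverse order, which is why a plain ascending index should be able to serve a query where both sort columns are DESC:

```sql
-- Sketch only: the index name is an arbitrary choice.
ALTER TABLE `mytable`
  ADD INDEX `id_date_time` (`id`, `event_date`, `event_time`);

-- Re-run EXPLAIN on the original query to check the effect.
EXPLAIN
SELECT *
FROM `mytable`
WHERE `id` = 'ref1'
ORDER BY `event_date` DESC,
         `event_time` DESC
LIMIT 0, 500;
```

If the optimizer picks the new index, the Extra column should no longer show "Using filesort". If it still does, comparing the key and key_len values against the earlier EXPLAIN output is a reasonable next step.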
Upvotes: 1