Reputation: 34556
Probably through poor database design, the following really simple query is taking ~1.5 minutes to run.
SELECT s.title, t.name AS team_name
FROM stories AS s
JOIN teams AS t ON s.team_id = t.id
WHERE s.pubdate >= "1970-01-01 00:00"
ORDER BY s.hits /* <-- here's the problem */
LIMIT 3 OFFSET 0
The problem is the stories
table is fairly big, with ~1.5m rows, and there's a ton of unique values for hits
(this column logs the hits to each story.)
Take out the order clause and it resolves almost instantly.
Question: what can I do to optimise for queries like this? Presumably I shouldn't apply an index to hits
since direct no look-ups take place on that column.
[UPDATE]
SHOW CREATE TABLE
for all tables concerned:
CREATE TABLE stories (
`id` varchar(11) NOT NULL,
`link` text NOT NULL,
`title` varchar(255) CHARACTER SET utf8 NOT NULL,
`description` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
`pubdate` datetime NOT NULL,
`source_id` varchar(11) NOT NULL,
`team_id` varchar(11) NOT NULL,
`hits` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `Unique combo (title + date)` (`title`,`pubdate`),
KEY `team (FK)` (`team_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
CREATE TABLE teams (
`id` varchar(11) NOT NULL,
`is_live` enum('y') DEFAULT NULL,
`name` varchar(50) NOT NULL,
`short_name` varchar(12) DEFAULT NULL,
`server` varchar(11) DEFAULT NULL,
`url_token` varchar(255) NOT NULL,
`league` varchar(11) NOT NULL,
`away_game_id` varchar(255) DEFAULT NULL,
`digest_list_id` varchar(25) DEFAULT NULL,
`twitter_handle` varchar(255) DEFAULT NULL,
`no_official_news` enum('y') DEFAULT NULL,
`alt_names` varchar(255) DEFAULT NULL,
`no_use_nickname` enum('y') DEFAULT NULL,
`official_hashtag` varchar(30) DEFAULT NULL,
`merge_news_and_fans` enum('y') DEFAULT NULL,
`colour_1` varchar(6) NOT NULL,
`colour_2` varchar(6) DEFAULT NULL,
`colour_3` varchar(6) DEFAULT NULL,
`link_colour_modifier` float DEFAULT NULL,
`alt_link_colour_modifier` float DEFAULT NULL,
`title_shade` enum('dark','light') NOT NULL,
`shirt_style` enum('vert_stripes','horiz_stripes','vert_stripes_thin','horiz_stripes_thin','vert_split','horiz_split') DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `URL token` (`url_token`),
KEY `league (FK)` (`league`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
Upvotes: 0
Views: 54
Reputation: 142296
Consider removing the filter on pubdate if the user does not need it. It confuses the optimizer.
INDEX(hits, pubdate, title)
will probably help the query the most. It is "covering".
The reason why removing ORDER BY
runs fast: Without it, it gives you any 3 rows. With it, and without a useful index, it needs to sort the 1.5M rows to discover the 3 with the least number of hits.
Perhaps you wanted ORDER BY s.hits DESC
? -- to get those with the most hits.
Upvotes: 1