Reputation: 51
We have a table with about 25,000,000 rows called 'events' having the following schema:
TABLE events
- campaign_id : int(10)
- city : varchar(60)
- country_code : varchar(2)
The following query takes VERY long (> 2000 seconds):
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id IN (597)
GROUP BY city, country_code
ORDER BY counted_events
We found out that it's because of the GROUP BY part.
There is already an index idx_campaign_id_city_country_code on (campaign_id, city, country_code), and it is used.
Maybe someone can suggest a good solution to speed it up?
Update:
EXPLAIN shows that, out of many possible indexes, MySQL uses this one: 'idx_campaign_id_city_country_code'. For 'rows' it shows '471304', and for 'Extra' it shows 'Using where; Using temporary; Using filesort'.
Here is the whole result of EXPLAIN:
UPDATE:
Ok, I think it has been solved:
Looking at the pasted query again, I realized that I forgot to mention that there was one more column in the SELECT, called 'country_name'. The query was very slow with country_name included; if I leave it out, the performance of the query is absolutely ok. Sorry for that mistake!
So thank you for all your helpful comments, I'll upvote all the good answers! There were some really helpful additions that we will probably also apply (like changing types etc).
Upvotes: 5
Views: 1582
Reputation: 699
The problem is that MySQL doesn't use the index for sorting. I cannot say why, because it should. Could be a bug.
The best strategy to execute this query is to scan the sub-tree of the index where campaign_id = 597. Since that part of the index is already sorted by city, country_code, no extra sorting is needed for the grouping, and rows can be counted while scanning.
So the indexes are already optimal for this query. MySQL is just not using them correctly.
I'm getting more information off line. It seems this is not a database problem at all, but
As soon as country_name is dropped from the select list, the query reverts to an index-only scan ("using index" in EXPLAIN output) and is blazingly fast.
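To illustrate the index-only point: if country_name really has to be returned, one option might be a wider index that also contains that column, so MySQL never has to visit the table rows. This is only a sketch. The index name is made up, country_name is not in the schema posted in the question (so its type and length are unknown), and adding it to the GROUP BY assumes each country_code maps to exactly one country_name.

```sql
-- Hypothetical index: same leading columns as the existing one, plus country_name.
-- The name and the position of country_name are assumptions, not from the question.
CREATE INDEX idx_campaign_city_country_name
    ON events (campaign_id, city, country_code, country_name);

-- All referenced columns are now in one index. country_name is added to the
-- GROUP BY so the statement is also valid with ONLY_FULL_GROUP_BY enabled.
EXPLAIN
SELECT COUNT(*) AS counted_events, country_code, country_name
FROM events
WHERE campaign_id = 597
GROUP BY city, country_code, country_name
ORDER BY counted_events;
```

Whether the wider index pays off depends on the size of country_name and on the key-length limits of the storage engine, so check the EXPLAIN output before keeping it.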
Upvotes: 0
Reputation: 11655
Some ideas:
Given the nature and size of the table, it would be a great candidate for partitioning by country. That way the events of each country would be stored in a separate physical table, even though the whole thing still behaves as one big virtual table.
Is country_code a string? Maybe you have a country_id that would be easier to sort. (It may force you to create or change indexes.)
Are you really using the city in the group by?
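A rough sketch of the partitioning idea, assuming MySQL's built-in KEY partitioning and an arbitrary partition count of 16. The question does not show the table's primary or unique keys, so this may not apply as written.

```sql
-- Sketch only: spread the rows of events over 16 partitions by country_code.
-- 16 is an arbitrary number chosen for illustration.
-- MySQL rejects this statement if the table has a primary or unique key
-- that does not include country_code.
ALTER TABLE events
    PARTITION BY KEY (country_code)
    PARTITIONS 16;
```

Note that the query in the question filters on campaign_id, not on country_code, so partition pruning would not kick in for it; partitioning by country would mainly help queries that restrict the country.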
Upvotes: 0
Reputation: 2230
Without seeing what EXPLAIN says, it's a long shot. Anyway:
post the entire EXPLAIN output
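For reference, the full plan can be obtained by prefixing the query from the question with EXPLAIN, roughly like this (query copied from the question, with the typos fixed):

```sql
-- Prints one row per table access with the chosen key, the estimated
-- number of rows, and the Extra column.
EXPLAIN
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id IN (597)
GROUP BY city, country_code
ORDER BY counted_events;
```

The interesting columns are key, rows and Extra; "Using index" in Extra means the query is answered from the index alone.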
Upvotes: 3
Reputation: 4481
Don't use IN(); better use:
WHERE campaign_id = 597
OR campaign_id = 231
OR ....
afaik IN() is very slow.
update: like nik0lias commented, IN() is faster than concatenating OR conditions.
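For comparison, here are the two forms side by side, using the campaign ids 597 and 231 mentioned above as example values. Both return the same rows; with a list of constants the IN() form is the more compact one, and it is the one the comment recommends.

```sql
-- IN() with a list of constants.
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id IN (597, 231)
GROUP BY city, country_code
ORDER BY counted_events;

-- Equivalent chain of OR conditions.
SELECT COUNT(*) AS counted_events, country_code
FROM events
WHERE campaign_id = 597
   OR campaign_id = 231
GROUP BY city, country_code
ORDER BY counted_events;
```

Running EXPLAIN on both is an easy way to check whether the optimizer treats them differently on a given MySQL version.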
Upvotes: 0