Reputation: 1078
I have a MYISAM MySQL DB table with many millions of rows inside which I've been asked to work with, but I need to make the queries faster first.
There was no indexing before at all! I added a new index on the 'type' column which has helped but I wanted to know if there were any other columns that might be best indexed too?
Here is my CREATE TABLE:
CREATE TABLE `clicks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`companyid` int(11) DEFAULT '0',
`type` varchar(32) NOT NULL DEFAULT '',
`contextid` int(11) NOT NULL DEFAULT '0',
`period` varchar(16) NOT NULL DEFAULT '',
`timestamp` int(11) NOT NULL DEFAULT '0',
`location` varchar(32) NOT NULL DEFAULT '',
`ip` varchar(32) DEFAULT NULL,
`useragent` varchar(64) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `companyid` (`companyid`,`type`,`period`),
KEY `type` (`type`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
A typical SELECT statement would commonly filter by the companyid
, type
and contextid
columns.
For example:
SELECT period, count(period) as count FROM clicks WHERE contextid in (123) AND timestamp > 123123123 GROUP BY period ORDER BY timestamp ASC
or
SELECT period, count(period) as count FROM clicks WHERE contextid in (123) AND type IN('direct') AND timestamp > 123123123 GROUP BY period ORDER BY timestamp ASC
The last part of my questions would be this: when I added the index on type
it took about 1 hour - if I am adding or removing multiple indexes, can you do it in one query or do you have to do them 1 by 1 and wait for each to finish?
Thanks for your thoughts.
Upvotes: 1
Views: 1724
Reputation: 53830
You can add multiple indexes in a single query. This will save some time overall, but the table will be inaccessible while you wait for the entire query to complete:
ALTER TABLE table1 ADD INDEX `Index1`('col1'),
ADD INDEX `Index2`('col2')
Regarding indexes, it's a complex subject. However, adding indexes on single columns with high-cardinality that are included in your WHERE clause is a good place to start. MySQL will try to pick the best index for the query and use that.
To further tweak performance, you should consider multi-column indexes, which I see you've implemented with your 'companyid' index.
To be able to utilize an index all the way through to a GROUP BY or ORDER BY clause relies on a lot of conditions, that you might want to read up on.
To best utilize indexes, your database server must have enough RAM to store the indexes entirely in memory and the server must be configured properly to actually utilize the memory.
Upvotes: 0
Reputation: 7871
In my opinion timestamp
and period
can be indexed as they are being used in the WHERE
clause.
Also instead of using contextid in (123)
use contextid = 123
and instead of type IN('direct')
use type = 'direct'
Upvotes: 0
Reputation: 7268
Which indexes to add really depends on your queries. Anything that you're sorting (GROUP BY) or selecting (WHERE) on is a good candidate for an index.
You may also want to have a look at how Mysql uses indexes.
As regards the time taken to add indexes, where you're sure you want to add multiple indexes, you could do mysqldump, manually edit the table structure in the .sql file, and then reimport. This can take a while, but at least you can do all the changes at once. However, this doesn't really fit with the idea of testing as you go... so use this approach with care. (I've done it when modifying a number of tables with the same structure, and wanting to add some indexes to all of them.)
Also, I'm not a 100% sure, but I think that when you add an index, Mysql creates a copy of the table with the index, and then deletes the original table - so make sure there's enough space on your server / partition for the current size of the table & some margin.
Upvotes: 1
Reputation: 6450
Indexing is really powerful, but isn't as much of a black art as you might think. Learn about MySQL's EXPLAIN PLAN capabilities, this will help you systematically find where improvements can be made:
http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
Upvotes: 3
Reputation: 51445
Here's one of your queries, broken on multiple lines so it's easier to read.
SELECT period, count(period) as count
FROM clicks
WHERE contextid in (123)
AND timestamp > 123123123
GROUP BY period
ORDER BY timestamp ASC
I'm not even sure this is a valid query. I thought your GROUP BY and ORDER BY had to match up in SQL. I think that you would have to order on count
, as the GROUP BY would order on period
.
The important part of the query for optimization is the WHERE clause. In this case, an index on contextid
and timestamp
would speed up the query.
Obviously, you can't index every WHERE clause. You index the most common WHERE clauses.
I'd add indexes to existing tables one at a time. Yes, it's slow. But you should only have to add the indexes once.
Upvotes: 0