Reputation: 2804
I'm using MySQL, although I suspect this is a generic database question.
I have a table consisting of 6 numeric columns. The first 5 of these make up the primary key.
It is a large table (20 million rows and growing), so some queries take time - about 10secs, which in itself is not too long, but I need to run a lot of them.
I understand that the primary key is automatically indexed - is there any advantage in me separately indexing some groups of columns within the primary key that I usually query on?
That is,, if I regularly query on the first 3 of the 5 primary key columns, should I create an additional index for these 3, or is that redundant because it's already part of the primary key index?
Upvotes: 4
Views: 1694
Reputation: 108806
Ten seconds is quite a long time for a query that returns one or a tiny handful of rows. If the query is returning 3% of the table's contents, though, ten seconds is not too long.
Your primary unique key is backed up by a composite index, let's say an index on
(I1,I2,I3,I4,I5)
You are correct that a query like
WHERE I1 = val AND I2 = val AND I3 = val
and
WHERE I3 = val AND I2 = val AND I1 = val
should use the index created for the primary key. The important thing is that the columns in the composite index are all used, starting with the leftmost one. A query like
WHERE I3 = val AND I4 = val AND I5 = val
won't use the primary key's composite index very well, if at all. Neither will a query that does some kind of computation on the column values mentioned in the key, like
WHERE I1+I2+I3=sumvalue
Keep in mind that "should work" is not the same as "does work." Try using the EXPLAIN command in MySQL to figure out whether the DBMS is doing what you expect it to for your query.
http://dev.mysql.com/doc/refman/5.1/en/explain.html
Upvotes: 3
Reputation: 3754
Why not just create a few test queries, create the index on a copy of the table and see how it performs?
When it comes to performance, measuring is always better than trusting an opinion.
The "best" solution in a database largely depends on the specific details of the table(s) involved. What range of values in the columns, what distribution of values, what type of queries, relative frequency of select/delete/insert/update queries, etc.
That being said, my guess is that an index on a subset will help if that subset contains all columns used in a query. You might get better performance if you include the result set (column in the select) in the index.
Upvotes: 0