mcmillab
mcmillab

Reputation: 2804

Creating an Index that is a subset of another Index in MySQL

I'm using MySQL, although I suspect this is a generic database question.

I have a table consisting of 6 numeric columns. The first 5 of these make up the primary key.

It is a large table (20 million rows and growing), so some queries take time - about 10secs, which in itself is not too long, but I need to run a lot of them.

I understand that the primary key is automatically indexed - is there any advantage in me separately indexing some groups of columns within the primary key that I usually query on?

That is,, if I regularly query on the first 3 of the 5 primary key columns, should I create an additional index for these 3, or is that redundant because it's already part of the primary key index?

Upvotes: 4

Views: 1694

Answers (2)

O. Jones
O. Jones

Reputation: 108806

Ten seconds is quite a long time for a query that returns one or a tiny handful of rows. If the query is returning 3% of the table's contents, though, ten seconds is not too long.

Your primary unique key is backed up by a composite index, let's say an index on

  (I1,I2,I3,I4,I5)

You are correct that a query like

  WHERE I1 = val AND I2 = val AND I3 = val

and

  WHERE I3 = val AND I2 = val AND I1 = val

should use the index created for the primary key. The important thing is that the columns in the composite index are all used, starting with the leftmost one. A query like

  WHERE I3 = val AND I4 = val AND I5 = val

won't use the primary key's composite index very well, if at all. Neither will a query that does some kind of computation on the column values mentioned in the key, like

  WHERE I1+I2+I3=sumvalue

Keep in mind that "should work" is not the same as "does work." Try using the EXPLAIN command in MySQL to figure out whether the DBMS is doing what you expect it to for your query.

http://dev.mysql.com/doc/refman/5.1/en/explain.html

Upvotes: 3

Martijn
Martijn

Reputation: 3754

Why not just create a few test queries, create the index on a copy of the table and see how it performs?

When it comes to performance, measuring is always better than trusting an opinion.

The "best" solution in a database largely depends on the specific details of the table(s) involved. What range of values in the columns, what distribution of values, what type of queries, relative frequency of select/delete/insert/update queries, etc.

That being said, my guess is that an index on a subset will help if that subset contains all columns used in a query. You might get better performance if you include the result set (column in the select) in the index.

Upvotes: 0

Related Questions