Reputation: 1927
I am unsure how best to index a table in MySQL and need help choosing the right kind of index. Currently I am using a unique-key index on this table, but I do not know whether this is the best approach, and in some situations I cannot use it at all because of MySQL limitations.
The table consists of a primary key and n columns; to keep it simple, n = 4 in this scenario. So the table looks like this: pk, col1, col2, col3, col4
The values in col1 to col-n are VARCHARs, typically between 1 and 4 characters long. The primary key is a concatenation of the col values. So typical rows could look like the following:
A:B:C:D, A, B, C, D
A:B:C:E, A, B, C, E
A:B:F:F, A, B, F, F
Where the first element is the primary key, and subsequent elements are col1, col2, etc.
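To make the layout concrete, here is a minimal sketch of the table with the three sample rows. The table name mytable, the VARCHAR lengths and the NOT NULL constraints are assumptions for illustration, not part of my real schema.

```sql
-- Assumed names and sizes: mytable, VARCHAR(4) value columns, and a
-- VARCHAR(19) pk (four values of up to 4 characters plus three ':').
CREATE TABLE mytable (
    pk   VARCHAR(19) NOT NULL,
    col1 VARCHAR(4)  NOT NULL,
    col2 VARCHAR(4)  NOT NULL,
    col3 VARCHAR(4)  NOT NULL,
    col4 VARCHAR(4)  NOT NULL,
    PRIMARY KEY (pk)
);

-- The sample rows listed above.
INSERT INTO mytable (pk, col1, col2, col3, col4) VALUES
    ('A:B:C:D', 'A', 'B', 'C', 'D'),
    ('A:B:C:E', 'A', 'B', 'C', 'E'),
    ('A:B:F:F', 'A', 'B', 'F', 'F');
```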
The table needs to be optimised for queries, not inserts. The queries that I wish to perform will have a WHERE clause where we know some of the values in columns 1-4. So for example I might want to find all rows where the second column is 'B' or 'C'. Once I have the primary key I use this to JOIN another table.
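A typical query of this kind might look like the sketch below. The names mytable, other_table and the join column on the other table are hypothetical placeholders.

```sql
-- Find rows whose second column is 'B' or 'C', then use the primary key
-- to pull matching rows from another table.
-- mytable, other_table and other_table.pk are hypothetical names.
SELECT o.*
FROM mytable AS t
JOIN other_table AS o ON o.pk = t.pk
WHERE t.col2 IN ('B', 'C');
```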
I was creating a unique key on col1-4 (as they are unique). The problem is that as soon as n becomes large (>16), I can no longer create a unique key index (MySQL limits a unique key, like any other index, to 16 columns). Losing the constraint itself is not a problem, as the primary key ensures uniqueness. However, I am unsure of two things:
a) Is the unique key a good index to use in order to optimise the speed of the queries?
b) When I cannot use a unique key, what index should I use?
I have the following options, and I’m not sure which (if any) is the best:
a) Create a single index on (col1, col2, col3, col4)
b) Create an index per column (col1), (col2)…(col-n)
c) Create an index per col, with the pk included (pk, col1), (pk, col2), (pk, col-n)
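Spelled out as DDL for n = 4, the three options would look roughly like this. The table name mytable and all index names are made up for illustration.

```sql
-- a) A single composite index over all value columns.
CREATE INDEX idx_all ON mytable (col1, col2, col3, col4);

-- b) One single-column index per value column.
CREATE INDEX idx_col1 ON mytable (col1);
CREATE INDEX idx_col2 ON mytable (col2);
CREATE INDEX idx_col3 ON mytable (col3);
CREATE INDEX idx_col4 ON mytable (col4);

-- c) One index per value column, with the pk as the leading column.
CREATE INDEX idx_pk_col1 ON mytable (pk, col1);
CREATE INDEX idx_pk_col2 ON mytable (pk, col2);
CREATE INDEX idx_pk_col3 ON mytable (pk, col3);
CREATE INDEX idx_pk_col4 ON mytable (pk, col4);
```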
Any help you can provide is greatly appreciated.
Thanks
Phil
Upvotes: 2
Views: 1247
Reputation: 31675
An index on (col1, col2, col3, col4) can only be used for lookups if the WHERE clause contains a condition on its leading columns. That means that if the query does not contain a condition on col1, the index cannot be used to look up rows at all (see Multiple-Column Indexes in the MySQL manual). If you have such queries, additional indices should be defined. These might be (col2, col3, col4), (col3, col4) and (col4).
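A small sketch of that leftmost-prefix rule follows. The table name mytable and the index names are assumptions, and what the optimizer actually picks depends on the MySQL version and the data, so treat the comments as expectations to verify with EXPLAIN rather than guarantees.

```sql
-- Hypothetical index name; mytable is an assumed table name.
CREATE INDEX idx_col1_to_col4 ON mytable (col1, col2, col3, col4);

-- Conditions start at col1, so they form a leftmost prefix of the index
-- and the index can be used for the lookup.
EXPLAIN SELECT pk FROM mytable WHERE col1 = 'A' AND col2 = 'B';

-- No condition on col1: this is not a leftmost prefix, so an index
-- lookup through idx_col1_to_col4 is not expected here.
EXPLAIN SELECT pk FROM mytable WHERE col2 = 'B';

-- An additional index whose leading column is col2 serves that query.
CREATE INDEX idx_col2_to_col4 ON mytable (col2, col3, col4);
```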
On the other hand, separate indices on (col1), (col2), (col3) and (col4) are also a good choice. In that case, there is no need to include the primary key in the indices (with InnoDB, every secondary index already stores the primary key value). I'd prefer this solution over the solution mentioned above.
I find your choice of primary key strange. If (col1, col2, col3, col4) is unique, use that as a primary key. If you do not want a primary key on four columns (most people don't), the next choice is often a surrogate key (i.e. an auto_increment column in MySQL). In that case, a unique key on (col1, col2, col3, col4) enforces data integrity.
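As a rough sketch of the surrogate-key variant: the table name, the id column, the key name and the column sizes below are all assumptions chosen for illustration.

```sql
-- Hypothetical names: mytable_surrogate, id, uq_cols. VARCHAR(4) is assumed
-- from the value lengths described in the question.
CREATE TABLE mytable_surrogate (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    col1 VARCHAR(4)   NOT NULL,
    col2 VARCHAR(4)   NOT NULL,
    col3 VARCHAR(4)   NOT NULL,
    col4 VARCHAR(4)   NOT NULL,
    PRIMARY KEY (id),
    -- Enforces that each combination of values appears only once.
    UNIQUE KEY uq_cols (col1, col2, col3, col4)
);
```

Other tables would then join on id instead of the concatenated string.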
Upvotes: 2
Reputation: 425613
MySQL is able to merge join several indexes within a single table on PK, as long as you are searching for exact key values (not ranges).
So if you create separate indexes on col1 to colN, you may run this query:
SELECT *
FROM mytable
WHERE col2 = 'B'
   OR col3 = 'C'
which will result in the indexes on col2 and col3 being merge joined (you will see it as index_merge using union(col2, col3) in the EXPLAIN output).
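To check this on your own data, something along these lines should work. The index names are made up, and on a very small table the optimizer may well prefer a full scan, so the plan described in the comment is what to look for rather than a guaranteed result.

```sql
-- Hypothetical index names on the single columns.
CREATE INDEX idx_col2 ON mytable (col2);
CREATE INDEX idx_col3 ON mytable (col3);

-- If index merge is chosen, the type column shows index_merge and the
-- Extra column mentions "Using union(...)" with the two index names.
EXPLAIN
SELECT *
FROM mytable
WHERE col2 = 'B'
   OR col3 = 'C';
```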
To ensure uniqueness, it's enough to declare your first column the PRIMARY KEY, so as long as you maintain your data consistency (the PK value is indeed the col* values concatenated and separated), your data uniqueness will be policed by the PK.
Upvotes: 0