Reputation: 3807

Best way to index a table with a unique multi-column?

I am creating a table which will store around 100million rows in MySQL 5.6 using InnoDB storage engine. This table will have a foreign key that will link to another table with around 5 million rows.

Current Table Structure:

`pid`: [Foreign key from another table]

`price`: [decimal(9,2)]

`date`: [date field]

and every pid should have only one record for a date

What is the best way to create indexes on this table?

Option #1: Create Primary index on two fields pid and date

Option #2: Add another column id with AUTO_INCREMENT and primary index and create a unique index on column pid and date

Or any other option?

Only select query i will be using on this table is:

SELECT pid,price,date FROM table WHERE pid = 123

Upvotes: 1

Answers (3)

Rick James

Reputation: 142366

Based on what you said (100M; the only query is...; InnoDB; etc):

PRIMARY KEY(pid, date);

and no other indexes

Some notes:

Since it is InnoDB, all the rest of the fields are "clustered" with the PK, so a lookup by pid is acts as if price were part of the PK. Also WHERE pid=123 ORDER BY date would be very efficient.
No need for INDEX(pid, date, price)
Adding an AUTO_INCREMENT gains nothing (except a hint of ordering). If you needed ordering, then an index starting with date might be best.
Extra indexes slow down inserts. Especially UNIQUE ones.

Upvotes: 1

Roberto Conde

Reputation: 3277

I would try with an index that attempts to cover the query, in the hope that MySQL has to access to the index only in order to get the result set.

ALTER TABLE `table` ADD INDEX `pid_date_price` (`pid` , `date`, `price`);

ALTER TABLE `table` ADD INDEX `pid_price_date` (`pid` , `price`, `date`);

Choose the first one if you think you may need to select applying conditions over pid and date in the future, or the second one if you think the conditions will be most probable over pid and price.

This way, the index has all the data the query needs (pid, price and date) and its indexing on the right column (pid)

By the way, always use EXPLAIN to see if the query planner will really use the whole index (take a look at the key and keylen outputs)

Upvotes: 0

Gordon Linoff

Reputation: 1270391

Either method is fine. I prefer having synthetic primary keys (that is, the auto-incremented version with the additional unique index). I find that this is useful for several reasons:

You can have a foreign key relationship to the table.
You have an indicator of the order of insertion.
You can change requirements, so if some pids allows two values per day or only one per week, then the table can support them.

That said, there is additional overhead for such a column. This overhead adds space and a small amount of time when you are accessing the data. You have a pretty large table, so you might want to avoid this additional effort.

Upvotes: 1

Best way to index a table with a unique multi-column?

Answers (3)

Related Questions