Reputation: 3807
I am creating a table which will store around 100million rows in MySQL 5.6 using InnoDB storage engine. This table will have a foreign key that will link to another table with around 5 million rows.
Current Table Structure:
`pid`: [Foreign key from another table]
`price`: [decimal(9,2)]
`date`: [date field]
and every pid
should have only one record for a date
What is the best way to create indexes on this table?
Option #1: Create Primary index on two fields pid
and date
Option #2: Add another column id
with AUTO_INCREMENT and primary index
and create a unique index on column pid
and date
Or any other option?
Only select query i will be using on this table is:
SELECT pid,price,date FROM table WHERE pid = 123
Upvotes: 1
Views: 139
Reputation: 142366
Based on what you said (100M; the only query is...; InnoDB; etc):
PRIMARY KEY(pid, date);
and no other indexes
Some notes:
price
were part of the PK. Also WHERE pid=123 ORDER BY date
would be very efficient.AUTO_INCREMENT
gains nothing (except a hint of ordering). If you needed ordering, then an index starting with date
might be best.UNIQUE
ones.Upvotes: 1
Reputation: 3277
I would try with an index that attempts to cover the query, in the hope that MySQL has to access to the index only in order to get the result set.
ALTER TABLE `table` ADD INDEX `pid_date_price` (`pid` , `date`, `price`);
or
ALTER TABLE `table` ADD INDEX `pid_price_date` (`pid` , `price`, `date`);
Choose the first one if you think you may need to select applying conditions over pid and date in the future, or the second one if you think the conditions will be most probable over pid and price.
This way, the index has all the data the query needs (pid, price and date) and its indexing on the right column (pid)
By the way, always use EXPLAIN to see if the query planner will really use the whole index (take a look at the key and keylen outputs)
Upvotes: 0
Reputation: 1270391
Either method is fine. I prefer having synthetic primary keys (that is, the auto-incremented version with the additional unique index). I find that this is useful for several reasons:
pid
s allows two values per day or only one per week, then the table can support them.That said, there is additional overhead for such a column. This overhead adds space and a small amount of time when you are accessing the data. You have a pretty large table, so you might want to avoid this additional effort.
Upvotes: 1