Reputation: 151
Here is what I have :
table content : cat_id product_id data1 data2 etc.
the categories are not unique obviously.
the product ids are unique.
2 queries :
1 -- SELECT * WHERE cat_id = :cat - must be as quick as possible
2 -- SELECT * WHERE product_id = :prodId
In second select, I can add : AND cat_id = :cat
What is the more efficient ?
For information, I'll have around 20 products in each category and a lot of categories (say 3000) - And (as it is unique in table) ONE product belongs to only ONE category - In fact, that is not really cats and products, that is for the simplicity of explaination;)
thanks!
Upvotes: 4
Views: 2559
Reputation: 142218
If these are your only two queries:
SELECT * FROM tablename WHERE cat_id = :cat_id and product_id = :prodId
SELECT * FROM tablename WHERE cat_id = :cat_id
and you have some other way to assure that product_id
is UNIQUE
, then you need only:
PRIMARY KEY(cat_id, product_id)
It is optimal for both SELECTs
.
It is better than INDEX(cat_id)
because (1) secondary keys have to finish their work with a PK lookup, and (2) all the cat rows are adjacent, thereby more efficient.
If product_id
is actually an AUTO_INCREMENT
, then add
INDEX(product_id)
No, you don't need to say UNIQUE
(unless you are prone to deliberately trying to insert duplicate product_ids
). The only thing that AI requires is that the id be first in some index so that it can do the equivalent of SELECT max(id)
whenever mysqld is restarted.
My recommendations apply regardless of the size of the table.
The order of clauses in WHERE
does not matter.
JOINs
do not require anything in particular. It is slightly more efficient to JOIN
on a PRIMARY KEY
than on a secondary key, which is a lot more efficient (but still possible) than on a non-indexed column(s).
Upvotes: 1
Reputation: 53734
A database without Primary key is only half dressed and according to you product_id is an ideal candidate for a primary key, so let choose that. The primary key will be used in
SELECT * WHERE product_id = :prodId
It does not matter if and cat_id = :cat_id
becomes a part of the query or not unless you have thousands of cat_ids
associated with each product_id
.
Then choose an index on cat_id. This will be used on
SELECT * WHERE cat_id = :cat
This will be very quick if the cardinality of the data is good. That means there is a wide distribution of cat_id
s in the table. The index of cat_id
will not be used in the first query. Thus you have two different indexes and both queries can be expected to be really quick.
Since [cat_id+product_id] != [product_id+cat_id]
when it comes to indexing, if you have only one composite index one or the other will be slow.
For example, suppose we had a composite index on (cat_id, product_id) now the following query cannot make use of this index.
SELECT * FROM tablename WHERE product_id = :prodId
But both these queries can use the (cat_id, product_id) index
SELECT * FROM tablename WHERE cat_id = :cat_id and product_id = :prodId
SELECT * FROM tablename WHERE cat_id = :catId
So in summary. Choose 1 and 2. But if the number of cat_id
s are small or there are lots of cat_ids associated with each product_id
choose 4 but make sure that the primary key is also in place.
Upvotes: 5