lmao
lmao

Reputation: 462

Indexing a column having duplicate values

Consider these tables:

seed (seedid, seedname) # PK-(seedid) 
stock (seedid, stk, storeid)  # PK-(seedid,storeid), FK-(storeid)
#InnoDB

stock: 
      seedid, stk, storeid
         1    12     81
         2    13     81
         3    14     81

         1    12     82
         2    11     82
         3    13     82
         4    12     82

Query -> select stk from stock where seedid = 'aaa' and storeid = 'yyy'.

Table stock represent stocks of several stores, hence storeid will be repeated over.

How do I index, table stock, given that it will be queried frequently using storeid ?

Primary keys as automatically indexed, and since (seedid,storeid) is already a Primary key, is there no need to index it further?

Upvotes: 1

Views: 6135

Answers (2)

spencer7593
spencer7593

Reputation: 108370

Based on the incomplete specification, I would do this:

   CREATE UNIQUE INDEX stock_UX1 ON stock (storeid,seedid,stk)

This index would satisfy the requirement for an index with storeid as the leading column. (And we know will have that requirement if this is InnoDB and storeid is a foreign key.)

With such a short table row, I'd go ahead and make it a covering index, and include all of the columns. Then queries can be satisfied directly from the index pages without lookups to data pages in the underlying table.

Since we know that (seedid,storeid) is unique (given as the PRIMARY KEY), we know (storeid,seedid) is also unique, so we might as well declare the index to be UNIQUE.


There are other choices; we don't have to create that index above. We could just do this instead:

   CREATE INDEX stock_IX2 ON stock (storeid)

But that will use nearly the same amount of space, and not be as beneficial to as many possible queries.


The secondary index will contain the primary key of the table; so that second index will include the seedid column, given the PRIMARY KEY of the table. That is, the index is equivalent to this:

   CREATE INDEX stock_IX3 ON stock (storeid,seedid)

And we know the combination of those two columns is unique, so we can include the UNIQUE keyword

   CREATE UNIQUE INDEX stock_UX4 ON stock (storeid,seedid)

If we do an EXPLAIN on a query of the form

  EXPLAIN
  SELECT t.storeid
       , t.seedid
       , t.stk 
    FROM stock t 
   WHERE t.storeid = 'foo'

we are likely to see a range scan operation on the secondary index; but retrieving the value of stk column will require lookup to the data pages in the underlying table. Including the stk column in the secondary index will make the index a covering index for the query. With the index first recommended in the answer, we expect the EXPLAIN output to show "Using index".

Upvotes: 4

SilicaGel
SilicaGel

Reputation: 459

If seedid,storeid is set as a primary key, then it has a unique index already. However, queries can only use one index at a time. So if you are making regular queries for say something like 'where seedid=3 and storeid=5', that index would be used. But if you just use 'where storeid=5' it might not use the index. (You can use 'explain' in front of your query to see what indexes mysql is going to use.)

So basically you want an index that covers the columns that you need to query regularly in your where clause.

Upvotes: 1

Related Questions