Allen King
Allen King

Reputation: 2516

Indexes on a database table, one column common

I am trying to understand if it makes sense to have two separate indexes in the following scenario:

ColumnA, ColumnB, ColumnC

I have queries

1.  where ColumnA = xxx and columnB = xxx
2.  where ColumnA = xxx and ColumnC = xxx

If I create only one index, that is on ColumnA, will that help in both queries? Or shall I create two indexes Index1 on ColumnA + ColumnB and Index2 on ColumnA + ColumnC.

I understand having two index will provably better but I am trying to keep the number of indexes low because the table is rather big but columnA is fairly unique. ColumnA filters data of a particular entity and that entity can delve into that data only always.

Also, if there an index on ColumnA + ColumnB and if a query comes in which ColumnB is first and ColumnA second, will this index be used?

Upvotes: 1

Views: 69

Answers (2)

Rick James
Rick James

Reputation: 142356

(@Haney discusses the Question from one point of view; here is another.)

2 indexes is not 'bad'; 10 indexes is getting into a gray area of "too many indexes".

INDEX(A) helps with both of your queries.

INDEX(A,B) is very good for one of your queries, while helping with the other query. If you wanted to keep it to a single index, this might be the best choice.

But... if B is a TEXT column, you won't be allowed to use INDEX(A,B) because of size limitations. And, using a "prefix" INDEX(A, B(22)), though possible, may not be any better than INDEX(A).

Don't bother with INDEX(A,B,C). This is good for the query with A and B, but it is not any better than INDEX(A) for the other query.

Upvotes: 2

Haney
Haney

Reputation: 34842

Generally, creating an index on ColumnA only should help both queries. Indexes in most RDBMS things (MSSQL, MySQL, etc.) are b-tree structures. The key allows for fast looking down the line, in one direction only.

Additionally, creating a deeper index such as ColumnA, ColumnC should also help both queries, since the ColumnA component is still indexed first.

I would suggest evaluating which column is selected most frequently: ColumnB or ColumnC and applying an index to it with ColumnA.

An example: let's say that ColumnB is accessed in only 10% of the ColumnA queries whereas ColumnC is accessed 90%. In this case I'd set the index on ColumnA, ColumnC. It will help 100% of the A-C queries, and will probably (I'm not 100% certain in MySQL) help the A-B query as well since the system is generally smart enough (at least in MSSQL) to use the A-C index to select the ColumnA data in the A-B query (but still doing a full scan for the ColumnB component).

These kinds of indexes by the way are called covering indexes because your query selects only the columns included in the index data (which is a slight optimization as well).

The best read-performance will be 2 indexes (one for each set), however as you correctly noted this will slow down inserts, updates, and deletes a little. Not enough that you're likely to notice in most cases though.

Upvotes: 2

Related Questions