snh_nl

Reputation: 2955

How can I detect if a MySQL index is necessary or required?

We suspect that some of our queries can be improved. I know that I can dive into the slow query logs, but I ran across the post below for MS SQL and was wondering whether there is an easy way to analyze if an index is required (and would give an immediate speed improvement) for the current MySQL database.

Help appreciated

Resource for MS SQL: https://dba.stackexchange.com/questions/56/how-to-determine-if-an-index-is-required-or-necessary

Upvotes: 2

Views: 5682

Answers (2)

Rick James

Reputation: 142296

You can't.

There are ways to detect, over a period of time, whether an index is used (for example, the sys.schema_unused_indexes view in MySQL 5.7 and later). But there is no way to be sure that an index is not used. Let's say you have a once-a-month task that does some major maintenance on the table. And you really need a certain index to keep the task from locking the table and bringing down the application. If you checked for index usage for most of the month, but failed to include that usage, you might decide that you don't need the index. Then you would drop the index... and be sorry. (This is a real anecdote.)

Meanwhile, there are some simplistic rules about indexes...

  • INDEX(a) is unnecessary if you also have INDEX(a,b).
  • INDEX(id) is unnecessary if you also have PRIMARY KEY(id) or UNIQUE(id).
  • An index with 5 or more columns may be used, but is unlikely to be "useful". (Shorten it.)
  • INDEX(a), INDEX(b) is not the same as INDEX(a,b).
  • INDEX(b,a) is not the same as INDEX(a,b); you may need both.
  • INDEX(flag), where flag has a small number of distinct values, will probably never be used -- the optimizer will scan the table instead.
  • In many cases, "prefix" indexing (INDEX(foo(10))) is useless. (But there are many exceptions.)
  • "I indexed every column" -- a bad design pattern.
  • Often, but not always, having both a PRIMARY KEY and a UNIQUE key means that something is less than optimal.
  • InnoDB tables really should have an explicit PRIMARY KEY.
  • InnoDB implicitly includes the PK in any secondary key. So, given PRIMARY KEY(id), INDEX(foo) is really INDEX(foo, id).
  • Sometimes the Optimizer will ignore the WHERE clause and use an index for the ORDER BY.
  • Some queries have such skewed properties that the Optimizer will use a different index depending on different constants. (I have literally seen as many as 6 different explain plans for one query.)
  • "Index merge intersect" is almost always not as good as a composite index.
  • There are exceptions to most of these tips.
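
The first two rules in the list (an index is unnecessary when its columns are a leftmost prefix of another index) are mechanical enough to sketch in code. The following Python is only an illustration, not a MySQL tool: the function name `redundant_indexes` and the input shape (index name mapped to its ordered columns plus a uniqueness flag) are assumptions made up for this example. PRIMARY KEY and UNIQUE indexes are never flagged here, because they enforce a constraint as well as speeding up lookups.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: index name -> (ordered columns, enforces uniqueness)
IndexDef = Tuple[Tuple[str, ...], bool]


def redundant_indexes(indexes: Dict[str, IndexDef]) -> List[Tuple[str, str]]:
    """Return (redundant_index, covering_index) pairs by the leftmost-prefix rule."""
    found = []
    for name, (cols, unique) in indexes.items():
        if unique:
            # PRIMARY KEY / UNIQUE indexes enforce a constraint, so keep them.
            continue
        for other, (other_cols, other_unique) in indexes.items():
            if other == name:
                continue
            is_prefix = other_cols[:len(cols)] == cols
            same_cols = other_cols == cols
            # Two identical non-unique indexes: report only one of the pair.
            skip_duplicate = same_cols and not other_unique and other > name
            if is_prefix and not skip_duplicate:
                found.append((name, other))
                break
    return found


example = {
    "PRIMARY": (("id",), True),
    "idx_id": (("id",), False),       # duplicates the PRIMARY KEY
    "idx_a": (("a",), False),         # leftmost prefix of idx_a_b
    "idx_a_b": (("a", "b"), False),
    "idx_b_a": (("b", "a"), False),   # different column order, not redundant
}
print(redundant_indexes(example))
# [('idx_id', 'PRIMARY'), ('idx_a', 'idx_a_b')]
```

Note that the sketch deliberately keeps both `idx_a_b` and `idx_b_a`, matching the rule that INDEX(b,a) is not the same as INDEX(a,b). It says nothing about whether an index is actually used by any query.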

So, I prefer to take all the queries (SELECTs, UPDATEs, and DELETEs), decide on the optimal index for each, eliminate redundancies, etc, in order to find the "best" set of indexes. See my cookbook on creating an index, given a SELECT.

Upvotes: 10

Deepak Dholiyan

Reputation: 1912

You should definitely spend some time reading up on indexing. There's a lot written about it, and it's important to understand what's going on.

Broadly speaking, an index imposes an ordering on the rows of a table.

For simplicity's sake, imagine a table is just a big CSV file. Whenever a row is inserted, it's inserted at the end. So the "natural" ordering of the table is just the order in which rows were inserted.

Imagine you've got that CSV file loaded up in a very rudimentary spreadsheet application. All this spreadsheet does is display the data and number the rows in sequential order.

Now imagine that you need to find all the rows that have some value "M" in the third column. Given what you have available, you have only one option. You scan the table, checking the value of the third column for each row. If you've got a lot of rows, this method (a "table scan") can take a long time!

Now imagine that in addition to this table, you've got an index. This particular index is the index of values in the third column. The index lists all of the values from the third column, in some meaningful order (say, alphabetically) and for each of them, provides a list of row numbers where that value appears.

Now you have a good strategy for finding all the rows where the value of the third column is "M". For instance, you can perform a binary search! Whereas the table scan requires you to look at all N rows (where N is the number of rows), the binary search requires you to look at only about log2(N) index entries, even in the very worst case. Wow, that's sure a lot easier!
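
To make the comparison concrete, here is a minimal Python sketch of the analogy: a table scan next to a lookup through a sorted index using binary search. The sample rows and the names `table_scan`, `build_index`, and `index_lookup` are invented for this illustration. Real MySQL (InnoDB) indexes are B+trees rather than sorted arrays, so treat this as a picture of the idea, not of the engine.

```python
from bisect import bisect_left, bisect_right

# A tiny "CSV-like" table; the row number is the position in the list.
rows = [
    ("alice", 31, "M"),
    ("bob", 25, "K"),
    ("carol", 47, "M"),
    ("dave", 19, "A"),
    ("erin", 52, "Z"),
    ("frank", 38, "M"),
]


def table_scan(rows, col, value):
    """Check every row: work grows linearly with the number of rows."""
    return [i for i, row in enumerate(rows) if row[col] == value]


def build_index(rows, col):
    """Sorted values of one column, plus the row number for each entry."""
    entries = sorted((row[col], i) for i, row in enumerate(rows))
    keys = [key for key, _ in entries]
    row_numbers = [i for _, i in entries]
    return keys, row_numbers


def index_lookup(index, value):
    """Binary-search the sorted keys, then read off the matching row numbers."""
    keys, row_numbers = index
    lo = bisect_left(keys, value)
    hi = bisect_right(keys, value)
    return row_numbers[lo:hi]


third_col_index = build_index(rows, 2)
print(table_scan(rows, 2, "M"))            # [0, 2, 5]
print(index_lookup(third_col_index, "M"))  # [0, 2, 5]
```

Both calls return the same rows; the difference is that the index lookup only needs two binary searches over the sorted keys instead of a pass over every row.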

Of course, if you have this index, and you're adding rows to the table (at the end, since that's how our conceptual table works), you need to update the index each and every time. So you do a little more work while you're writing new rows, but you save a ton of time when you're searching for something.

So, in general, indexing creates a tradeoff between read efficiency and write efficiency. With no indexes, inserts can be very fast -- the database engine just adds a row to the table. As you add indexes, the engine must update each index while performing the insert.

On the other hand, reads that can use an index become a lot faster.
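
The write-side cost can be sketched the same way. In this hypothetical Python `Table` class (not a real database API), every insert appends the row and then updates each index, so the extra work per insert grows with the number of indexes. The `index_writes` counter exists only to make that visible.

```python
from bisect import insort


class Table:
    """Toy append-only table that keeps one sorted index per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0  # counts index updates, for illustration only

    def insert(self, row):
        row_number = len(self.rows)
        self.rows.append(row)  # the cheap part: add the row at the end
        for col, entries in self.indexes.items():
            # The extra part: keep every index sorted on each insert.
            insort(entries, (row[col], row_number))
            self.index_writes += 1
        return row_number


no_index = Table(indexed_columns=[])
two_indexes = Table(indexed_columns=[0, 2])
for row in [("bob", 25, "K"), ("alice", 31, "M"), ("carol", 47, "A")]:
    no_index.insert(row)
    two_indexes.insert(row)

print(no_index.index_writes)     # 0
print(two_indexes.index_writes)  # 6
print(two_indexes.indexes[0])    # [('alice', 1), ('bob', 0), ('carol', 2)]
```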

Upvotes: 0
