Fred Willmore
Fred Willmore

Reputation: 4604

Counting records in mysql innodb table

I am working on an app in Rails using mysql with innodb. I have a need to fetch the full table count quite often and I understand that with innodb, counting all the records in a table can be quite expensive as it requires a full table scan. I'm looking at a typical table in the rails console and looking at the query times for counting the records. What I find so far is that the first time counting the records it takes a long time, but on subsequent tries it is much faster. For example:

2.2.2 :002 > Request.count
   (683.7ms)  SELECT COUNT(*) FROM `requests`
 => 260588
2.2.2 :003 > Request.count
   (47.6ms)  SELECT COUNT(*) FROM `requests`
 => 260588
2.2.2 :004 > Request.count
   (46.7ms)  SELECT COUNT(*) FROM `requests`
 => 260588

So, first off, do I even need to worry about optimizing this result? Perhaps the count is being cached by Rails or mySql or InnoDB and there's nothing to worry about.

I'm going to assume that there still is something to worry about in a production environment where multiple users are writing to the table at any given time. In that case, how do I 'reset' the caching or whatever is giving my an unrealistically sunny outlook on the count times so that I can do some honest benchmarking? I've tried writing to a record, or just reloading the console with reload!, but I never get the initial long time. I'll bet if I quit the console and restart mysql that would do it, but I'd rather not have to work that hard.

Finally, I've heard that a query like the following will run faster:

select count(*) from requests use index(<index_name>);

It seems like the most natural index to use is the id field.

select count(*) from service_requests use index(id)

But this gives me the following error:

ERROR 1176 (42000): Key 'id' doesn't exist in table 'requests'

But id is not just a key, it's the primary key. On some tables it's the only index. Why is id not considered a key?

Upvotes: 0

Views: 127

Answers (2)

Rick James
Rick James

Reputation: 142356

  • Don't run OPTIMIZE TABLE on an InnoDB table; it almost never provides any improvement.
  • Don't use USE INDEX or FORCE INDEX, except as a last resort. It may help you today, but it will probably make things much worse tomorrow. The Optimize dynamically picks the "best" index, and it is usually 'correct' in its choice.
  • The PRIMARY KEY is probably the worst index to use for a full table COUNT(*).
  • SELECT COUNT(*) takes time (for InnoDB) because it must scan the entire table, bypassing any transactions that are in progress.
  • Such a scan will pick the 'smallest' index, so it may help to create INDEX(foo) where foo is some small column. However, this is unlikely to more than double the speed.
  • "684s, 47s, 47s" -- Probably the first request did not find all the data cached in RAM and had to fetch it from disk. The 2nd and 3rd were much faster because of caching.
  • Configure mysql's innodb_buffer_pool_size to be about 70% of available RAM in order to cut back on I/O.
  • If there are so many rows in results that it takes "too long", then the number is probably meaningless. Notice how some search engines no longer say "10 out of 1,234,566 results" or even "10 out of about 1,000,000 results". That is because they found it not worth computing or even estimating.
  • So, you next step (assuming you are unwilling to get rid of the count) is to come up with some way of estimating or caching the count.
  • If you SELECT has a WHERE clause, then much of my answer does not apply.

Upvotes: 0

cp50
cp50

Reputation: 164

To optimise it, run optimize table and then you could create another index that isn't the primary key as InnoDB uses clustered primary keys.

When creating another index you have to create it on another field which is easy to scan, not something like a text field (if it applies to requests).

The error - if you continue to use the primary key as opposed to creating another index for whichever reason:

As you mention that the primary key is indeed id then using use index(id) on id (the PK in your table from which the query comes from) won't work, it will be USE INDEX (PRIMARY). To see other indexes for the table, run the SHOW INDEX FROM <Table> command, and this will show the other index names for the table.

Further info: http://dev.mysql.com/doc/refman/5.7/en/optimize-table.html http://dev.mysql.com/doc/refman/5.7/en/index-hints.html

Upvotes: 1

Related Questions