HGPB
HGPB

Reputation: 4366

What's the best way to count MySQL records

I have a search engine on a shared host that uses MySQL. This search engine potentially has millions/trillions etc of records.

Each time a search is performed I return a count of the records that can then be used for pagination purposes.

The count tells you how many results there are in regard to the search performed. MySQL count is I believe considered quite slow.

Order of search queries:

  1. Search executed and results returned
  2. Count query executed

I don't perform a PHP count as this will be far slower in larger data sets.

Question is, do I need to worry about MySQL "count" and at what stage should I worry about it. How do the big search engines perform this task?

Upvotes: 2

Views: 1022

Answers (4)

N.B.
N.B.

Reputation: 14060

If MySQL database reaches several millions of records, that's a sign you'll be forced to stop using monolithic data store - meaning you'll have to split reads, writes and most likely use a different storage engine than the default one.

Once that happens, you'll stop using the actual count of the rows and you'll start using the estimate, cache the search results and so on in order to alleviate the work on the database. Even Google uses caching and displays an estimate of number of records.

Anyway, for now, you've got 2 options:

1 - Run 2 queries, one to retrieve the data and the other one where you use COUNT() to get the number of rows.

2 - Use SQL_CALC_FOUND_ROWS like @JohnFX suggested.

Percona has an article about what's faster, tho it might be outdated now.

The biggest problem you're facing is the way MySQL uses LIMIT OFFSET, which means you probably won't like your users using large offset numbers.

In case you indeed get millions of records - I don't forsee a bright future for your MySQL monolithic storage on a shared server. However, good luck to you and your project.

Upvotes: 2

JohnFx
JohnFx

Reputation: 34909

In almost all cases the answer is indexing. The larger your database gets the more important it is to have a well designed and optimized indexing strategy.

The importance of indexing on a large database can not be overstated.

You are absolutely right about not looping in code to count DB records. Your RDBMS is optimized for operations like that, your programming language is no. Wherever possible you want to do any sorting, grouping, counting, filtering operations within the SQL language provided by your RDBMS.

As for efficiently getting the count on a "paginated" query that uses a LIMIT clause, check out SQL_CALC_FOUND_ROWS.

SQL_CALC_FOUND_ROWS tells MySQL to calculate how many rows there would be in the result set, disregarding any LIMIT clause. The number of rows can then be retrieved with SELECT FOUND_ROWS(). See Section 11.13, “Information Functions”.

Upvotes: 5

user610217
user610217

Reputation:

If I understand what you are trying to do properly, you can execute the one query, and perform the mysql_num_rows() function on the result in PHP... that should be pretty zippy.

http://php.net/manual/en/function.mysql-num-rows.php

Upvotes: 0

Jonathan M
Jonathan M

Reputation: 17441

Since you're using PHP, you could use the mysql_num_rows method to tell you the count after the query is done. See here: http://www.php.net/manual/en/function.mysql-num-rows.php

Upvotes: -1

Related Questions