Shuai Li
Shuai Li

Reputation: 2776

Why select count(*) from table_name is so slow?

mysql> select count(*)
    -> from ip_address_varchar20;
+----------+
| count(*) |
+----------+
|  2764687 |
+----------+
1 row in set (1 min 28.80 sec)

I thought there should be a field to store the size of any table, but I find that count(*) is so slow.

Why DBMS do not optimize this? Or I do some bad practice on this?

Upvotes: 0

Views: 1804

Answers (2)

Michael Buen
Michael Buen

Reputation: 39463

From: https://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007

One operation that PostgreSQL is known to be slow performing is doing a full count of rows in a table, typically using this SQL:

SELECT COUNT(*) FROM table

The reason why this is slow is related to the MVCC implementation in PostgreSQL. The fact that multiple transactions can see different states of the data means that there can be no straightforward way for "COUNT(*)" to summarize data across the whole table; PostgreSQL must walk through all rows, in some sense. This normally results in a sequential scan reading information about every row in the table.

The explanation above for Postgres applies to MySQL's InnoDB as well. As InnoDB uses MVCC.

There are some approaches to do fast count with InnoDB. E.g., You can use estimate, or you can use triggers to maintain the table's row count.

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1271003

Some databases do store the size of a table externally. However, most databases read all the data -- either directly or through an index -- in order to satisfy a COUNT() query.

This ensures that the rows are accurately counted. It also takes transactional semantics into account.

Upvotes: -1

Related Questions