Reputation: 13
Q1: why count(*) is so much slower than count(distinct col)?
Q2: should id always use count(distinct col)?
select count(id) from source;
+-----------+
| count(id) |
+-----------+
| 22713 |
+-----------+
1 row in set (0.73 sec)
mysql> select count(distinct id) from source;
+--------------------+
| count(distinct id) |
+--------------------+
| 22836 |
+--------------------+
1 row in set (0.08 sec)
Upvotes: 1
Views: 1179
Reputation: 31
1) Make sure query result is not cached
2) Seems like ID column has NULL params and index. In that case count(id) give count for id with NOT NULL values . If column ID has no NULL param - use COUNT(*). It give you rows count without checking "column !== null" for each row
Upvotes: 0
Reputation: 6054
It may also be faster if it the query is cached by the mysql
This is my test with about 1.5 million rows, and id is the auto_increment PK
Upvotes: 0
Reputation: 780724
If the column is indexed, COUNT(DISTINCT id)
just needs to return the number of items in the index for the column. COUNT(id)
has to add up the number of rows that each index entry points to, or scan all the rows.
For your second question, see count(*) and count(column_name), what's the diff?. Most of the time, COUNT(*)
is most appropriate; there are some situations, such as counting rows joined with an outer join, where you need to use COUNT(columnname)
because you don't want to count the null rows.
Upvotes: 1