Sanco
Sanco

Reputation: 13

mysql COUNT(*) vs COUNT(DISTINCT col)

Q1: why count(*) is so much slower than count(distinct col)?

Q2: should id always use count(distinct col)?

select count(id) from source;
+-----------+
| count(id) |
+-----------+
|     22713 |
+-----------+
1 row in set (0.73 sec)

mysql> select count(distinct id) from source;
+--------------------+
| count(distinct id) |
+--------------------+
|              22836 |
+--------------------+
1 row in set (0.08 sec)

Upvotes: 1

Views: 1179

Answers (3)

Vitaly
Vitaly

Reputation: 31

1) Make sure query result is not cached

2) Seems like ID column has NULL params and index. In that case count(id) give count for id with NOT NULL values . If column ID has no NULL param - use COUNT(*). It give you rows count without checking "column !== null" for each row

Upvotes: 0

the_nuts
the_nuts

Reputation: 6054

It may also be faster if it the query is cached by the mysql

This is my test with about 1.5 million rows, and id is the auto_increment PK

enter image description here

Upvotes: 0

Barmar
Barmar

Reputation: 780724

If the column is indexed, COUNT(DISTINCT id) just needs to return the number of items in the index for the column. COUNT(id) has to add up the number of rows that each index entry points to, or scan all the rows.

For your second question, see count(*) and count(column_name), what's the diff?. Most of the time, COUNT(*) is most appropriate; there are some situations, such as counting rows joined with an outer join, where you need to use COUNT(columnname) because you don't want to count the null rows.

Upvotes: 1

Related Questions