Reputation: 13

mysql COUNT(*) vs COUNT(DISTINCT col)

Q1: why count(*) is so much slower than count(distinct col)?

Q2: should id always use count(distinct col)?

select count(id) from source;
+-----------+
| count(id) |
+-----------+
|     22713 |
+-----------+
1 row in set (0.73 sec)

mysql> select count(distinct id) from source;
+--------------------+
| count(distinct id) |
+--------------------+
|              22836 |
+--------------------+
1 row in set (0.08 sec)

Upvotes: 1

Answers (3)

Vitaly

Reputation: 31

1) Make sure query result is not cached

2) Seems like ID column has NULL params and index. In that case count(id) give count for id with NOT NULL values . If column ID has no NULL param - use COUNT(*). It give you rows count without checking "column !== null" for each row

Upvotes: 0

the_nuts

Reputation: 6074

It may also be faster if it the query is cached by the mysql

This is my test with about 1.5 million rows, and id is the auto_increment PK

Upvotes: 0

Barmar

Reputation: 782785

If the column is indexed, COUNT(DISTINCT id) just needs to return the number of items in the index for the column. COUNT(id) has to add up the number of rows that each index entry points to, or scan all the rows.

For your second question, see count(*) and count(column_name), what's the diff?. Most of the time, COUNT(*) is most appropriate; there are some situations, such as counting rows joined with an outer join, where you need to use COUNT(columnname) because you don't want to count the null rows.

Upvotes: 1

mysql COUNT(*) vs COUNT(DISTINCT col)

Answers (3)

Related Questions