Nikita Volkov
Nikita Volkov

Reputation: 43350

`Count Distinct` and `Group By` produce weird results

Please consider the following query:

SELECT artist.id, COUNT(DISTINCT artist$styles.v_id)
FROM artist
LEFT JOIN artist$styles ON artist$styles.p_id = artist.id

This is the result I get:

id  count
1   4

The questions are:

  1. How come it's only selecting one row from the artist table, when there are 4 rows in it and there are no WHERE, HAVING, LIMIT or GROUP BY clauses applied to the query?
  2. There are only three records in artist$styles having p_id of value 1, why is it counting 4?
  3. Why if I add a GROUP BY clause to it I get the correct results?

    SELECT artist.id, COUNT(DISTINCT artist$styles.v_id)
    FROM artist
    LEFT JOIN artist$styles ON artist$styles.p_id = artist.id
    GROUP BY artist.id
    ----
    id  count
    1   3
    2   1
    3   3
    4   1
    
  4. This all just doesn't make sense to me. Could this be a bug of MySQL? I'm running Community 5.5.25a

Upvotes: 0

Views: 2591

Answers (3)

Ajay Kadyan
Ajay Kadyan

Reputation: 1079

  1. Count Function return single row result if you are not using group by clause and that's why its returning one row.

2.In your output

id  count
1   4

4 is total no of results in that table not result for id 1.and it display in front of 1 because only one row produce.

3.when you use group by then a group of that column value is created that's why you get that output.

And finally its not a bug.Mysql provide a proper documentation for that you can read on mysql site.

Upvotes: 0

eggyal
eggyal

Reputation: 126025

  1. As stated in the manual page on aggregate functions (of which COUNT() is one):

    If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.

  2. As stated in the manual page on GROUP BY with hidden columns:

    The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

    In other words, the server has chosen one (indeterminate) value for column p_id, which happens in this case to be the value 1, whilst it has properly aggregated and counted the result for the COUNT() function.

  3. Because you are then grouping on the correct columns, rather than on all rows.

  4. It's not a bug; this behaviour is documented and by design.

Upvotes: 3

Madhivanan
Madhivanan

Reputation: 13700

It is a possible bug in Mysql. All non aggeregate columns should be included in Group by clause. MySQL does not force this and the result is unpredictable and hard to debug. As a rule always include all non-aggregate columns in the Group by clause. This is how all RDBMSs work

Upvotes: 0

Related Questions