Reputation: 205
I have the following data:
countryId languageUsed
1 English
1 France
2 English
2 Spanish
3 Italian
1 Russian
When I query:
SELECT count(DISTINCT countryId)
FROM languages
GROUP BY countryId
I get:
count(DISTINCT countryId)
1
1
1
Instead of just 3. When I remove the GROUP BY
clause it will give me the right results.
Can somebody explain what is the reason for this type of behavior.
Upvotes: 0
Views: 2544
Reputation: 2034
The purpose of GROUP BY
is taking several rows and turns them into one row.If this so, it has to know what to do with all the combined rows where there are different values. This is why you have two options for every field you want to SELECT
which is either include it in the GROUP BY
clause, or use it in an aggregate function so the system knows how you want to combine the field.
You're also experiencing a strict requirement of the GROUP BY
clause. Which is every column not in the GROUP BY
clause must have a function applied to reduce all records for the matching "group" to a single record such as SUM
, MAX
, MIN
, AVG
and others.
If you list all selected columns in the GROUP BY
clause, you are essentially requesting that duplicate records be excluded from the result set. That gives the same effect as SELECT DISTINCT
which also eliminates duplicate rows from the result set.
Unless if you desperately want to use them both and get you're desired result, you can just use SUM
. See below:
SELECT SUM(countryId) FROM
(
SELECT COUNT(DISTINCT countryId)countryId
FROM languages
GROUP BY countryId
) AS A
Upvotes: 1
Reputation: 1576
You shouldn't use count()
and group by
to the same column
in a query if you only want to count distinct countryId
. The result is like that because it distribute the returned count(DISTINCT countryId)
to multiple row
(because of the GROUP BY countryId
clause) instead of returning only a single row
. The code count(DISTINCT countryId)
will suffice.
Upvotes: 0