Reputation: 50989
Recently I found that even though patientID values are duplicated in my Samples table, the following query works:
SELECT * FROM Samples GROUP BY patientID
It returns values for all the other columns as well.
Which aggregation function does it use by default?
Upvotes: 1
Views: 205
Reputation: 9042
Since you've not specified the version of the MySQL server, there are two possible answers.
Prior to MySQL 5.7.5, the above query is valid, but the documentation carries the following caveat for all columns that are neither listed in GROUP BY nor aggregated:
The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
(https://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html)
As of MySQL 5.7.5, the default behaviour changed, and MySQL now follows the SQL99 standard:
SQL99 and later permits such nonaggregates per optional feature T301 if they are functionally dependent on GROUP BY columns
(https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html)
So some columns could be valid, but the query as a whole is not, since not every column is functionally dependent on the patientID column (for example, one patient could have both a blood sample and a skin sample).
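To make functional dependence concrete, here is a minimal sketch. The Patients table and the columns name, sampleID, and sampleType are hypothetical, not taken from the question. Assuming MySQL 5.7.5 or later with ONLY_FULL_GROUP_BY enabled, the first query should be accepted and the second rejected.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE Patients (
    patientID INT PRIMARY KEY,
    name      VARCHAR(100) NOT NULL
);
CREATE TABLE Samples (
    sampleID   INT PRIMARY KEY,
    patientID  INT NOT NULL,
    sampleType VARCHAR(20) NOT NULL
);

-- Accepted: p.name is functionally dependent on p.patientID,
-- because patientID is the primary key of Patients.
SELECT p.patientID, p.name, COUNT(s.sampleID) AS sampleCount
FROM Patients AS p
LEFT JOIN Samples AS s ON s.patientID = p.patientID
GROUP BY p.patientID;

-- Rejected under ONLY_FULL_GROUP_BY: sampleType is not determined
-- by patientID, since one patient can have several sample types.
SELECT patientID, sampleType
FROM Samples
GROUP BY patientID;
```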
In general, it is bad practice to use SELECT * in an aggregating query without defining what should happen to each of the resulting columns.
TL;DR: MySQL prior to 5.7.5 will execute the query and the result is unpredictable; MySQL 5.7.5 and later will, with the default settings, throw an error.
Upvotes: 0
Reputation: 34231
None. If the ONLY_FULL_GROUP_BY SQL mode is not enabled, MySQL allows this. From the documentation:
MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want.
This SQL mode is enabled by default only from v5.7.5 onward.
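As a rough sketch, the mode can be inspected and toggled for the current session as follows. The statements only affect the current connection; whether changing the mode is appropriate depends on your setup.

```sql
-- Show the SQL modes active for the current session.
SELECT @@SESSION.sql_mode;

-- Enable ONLY_FULL_GROUP_BY for this session, keeping the other modes.
SET SESSION sql_mode = (SELECT CONCAT(@@SESSION.sql_mode, ',ONLY_FULL_GROUP_BY'));

-- Disable it again for this session (restores the permissive behaviour).
SET SESSION sql_mode = (SELECT REPLACE(@@SESSION.sql_mode, 'ONLY_FULL_GROUP_BY', ''));
```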
Upvotes: 0
Reputation: 1269443
First, this is badly formed SQL and you should simply not use it.
But what does it do? It returns a result set with one row per patientID. The additional columns specified by the SELECT * come from indeterminate rows in the data. There is no guarantee that the extra columns even come from the same row.
In practice, the values seem to come from the first row encountered. However, MySQL is quite explicit that you cannot depend on this behavior. In general, you should avoid aggregation queries that have unaggregated columns in the SELECT that are not in the GROUP BY. Most other databases do not support this syntax (unless the GROUP BY keys form a unique/primary key on the data being aggregated).
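If the goal is one complete, well-defined row per patient, a common alternative is to pick that row explicitly. The sketch below assumes a hypothetical collectedAt column (not mentioned in the question) and returns the most recent sample for each patient.

```sql
-- Hypothetical column: collectedAt (timestamp of the sample).
-- Returns the full row of the latest sample for every patient.
SELECT s.*
FROM Samples AS s
JOIN (
    -- One row per patient with the latest collection time.
    SELECT patientID, MAX(collectedAt) AS latest
    FROM Samples
    GROUP BY patientID
) AS m
  ON m.patientID = s.patientID
 AND m.latest = s.collectedAt;
```

If two samples of the same patient share the latest collectedAt value, both rows are returned, so a further tie-breaker would be needed in that case.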
Upvotes: 3
Reputation: 520888
MySQL doesn't appear to use an aggregation function at all. The records chosen in this case are indeterminate, as the documentation states:
In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want.
But you might be wondering why this feature even exists in the first place. If you are writing a query where you know that all the values in a column will be the same within each group, then this feature can save you some work by not having to write a join or subquery to make the GROUP BY strictly compliant.
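On MySQL 5.7.5 and later, the same intent can be stated explicitly with ANY_VALUE(), which works even when ONLY_FULL_GROUP_BY is enabled. A minimal sketch, assuming a hypothetical patientName column that holds the same value in every row of a given patient:

```sql
-- patientName is a hypothetical column, assumed to be identical
-- across all rows that share a patientID.
SELECT patientID,
       ANY_VALUE(patientName) AS patientName,
       COUNT(*)               AS sampleCount
FROM Samples
GROUP BY patientID;
```

If the values are not actually identical within a group, the result is still indeterminate; ANY_VALUE() only tells the server that you accept that.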
Upvotes: 1
Reputation: 77
You have to use an aggregate function such as SUM, AVG, or COUNT (see dev.mysql GROUP BY).
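For example, the query from the question could be rewritten with an explicit aggregate for every column that is not in the GROUP BY. This is only a sketch: collectedAt and sampleType are hypothetical column names, and which aggregates make sense depends on the data.

```sql
-- Every selected column is either the grouping key or an aggregate,
-- so the result is well defined on any MySQL version.
SELECT patientID,
       COUNT(*)         AS sampleCount,
       MIN(collectedAt) AS firstCollected,
       MAX(collectedAt) AS lastCollected,
       GROUP_CONCAT(DISTINCT sampleType ORDER BY sampleType) AS sampleTypes
FROM Samples
GROUP BY patientID;
```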
Upvotes: 0