Reputation: 84
I have a table structure and data below.
I need to remove duplicate record from the table list. My confusion is that when I am firing query
SELECT * FROM `table` GROUP BY CONCAT(`name`,department)
then giving me correct list(12 records).
Same query when I am using the subquery:
SELECT *
FROM `table` WHERE id IN (SELECT id FROM `table` GROUP BY CONCAT(`name`,department))
It returning all record which is wrong.
So, My question is why group by
in subquery is not woking.
Upvotes: 1
Views: 491
Reputation: 24002
You can JOIN
the grouped ids with that of table ids, so that you can get desired results.
Example:
SELECT t.* FROM so_q32175332 t
JOIN ( SELECT id FROM so_q32175332
GROUP BY CONCAT( name, department ) ) f
ON t.id = f.id
ORDER BY CONCAT( name, department );
Here order by
was added just to compare directly the *
results on group
.
Demo on SQL Fiddle: http://sqlfiddle.com/#!9/d715a/1
Upvotes: 1
Reputation: 6844
Actually as Tim mentioned in his answer that it to get first unique record by group by clause is not a standard feature of sql but mysql allows it till mysql5.6.16 version but from 5.6.21 it has been changed.
Just change mysql version in your sql fiddle and check that you will get what you want.
Upvotes: 1
Reputation: 521259
In the query
SELECT * FROM `table` GROUP BY CONCAT(`name`,department)
You are selecting the id
column, which is a non-aggregate column. Many RDBMS would give you an error, but MySQL allows this for performance reasons. This means MySQL has to choose which record to retain in the result set. Based on the result set in your original problem, it appears that MySQL is retaining the id
of the first duplicate record, in cases where a group has more than one member.
In the query
SELECT *
FROM `table`
WHERE id IN
(
SELECT id FROM `table` GROUP BY CONCAT(`name`,department)
)
you are also selecting a non-aggregate column in the subquery. It appears that MySQL actually decides which id
value to be retained in the subquery based on the id
value in the outer query. That is, for each id
value in table
, MySQL performs the subquery and then selectively chooses to retain a record in the group if two id
values match.
You should avoid using a non-aggregate column in a query with GROUP BY
, because it is a violation of the ANSI standard, and as you have seen here it can result in unexpected results. If you give us more information about what result set you want, we can give you a correct query which will avoid this problem.
I welcome anyone who has documentation to support these observations to either edit my question or post a new one.
Upvotes: 1