Reputation: 3920
I have the following table called questions:
ID | asker
1 | Bob
2 | Bob
3 | Marley
I want to select each asker only once, and if several rows share the same asker, select the one with the highest ID. So, the expected results:
ID | asker
3 | Marley
2 | Bob
I use the following query:
SELECT * FROM questions GROUP by questions.asker ORDER by questions.id DESC
I get the following result:
ID | asker
3 | Marley
1 | Bob
It selects the first 'Bob' it encounters instead of the last one.
Upvotes: 44
Views: 131173
Reputation: 61
SELECT * FROM questions GROUP by questions.asker DESC;
worked for me
Upvotes: 0
Reputation: 20473
It's because ORDER BY is performed AFTER GROUP BY.
Try this:
SELECT * FROM questions
WHERE id IN
(
SELECT max(id)
FROM questions
GROUP BY asker
)
ORDER BY id DESC
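As a quick check of the subquery approach, here is a minimal sketch that runs the same idea against the sample data from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the query itself uses only standard SQL), and the variable name latest_per_asker is purely illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, asker TEXT)")
conn.executemany(
    "INSERT INTO questions (id, asker) VALUES (?, ?)",
    [(1, "Bob"), (2, "Bob"), (3, "Marley")],
)

# Keep only the rows whose id is the maximum id within its asker group,
# then sort the surviving rows in the outer query.
latest_per_asker = conn.execute(
    """
    SELECT id, asker
    FROM questions
    WHERE id IN (SELECT MAX(id) FROM questions GROUP BY asker)
    ORDER BY id DESC
    """
).fetchall()

print(latest_per_asker)  # [(3, 'Marley'), (2, 'Bob')]
conn.close()
```

The sort is applied in the outer query because that is the only place where it affects the order of the returned rows.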
Upvotes: 2
Reputation: 7448
I'm writing this answer because @Taryn's first (shorter) alternative in the accepted answer works only if you select exactly the columns used in GROUP BY and MAX. The asker is selecting all columns in the table (they used SELECT *). So when you add a third column to the table, that column's value in the query result will be incorrect: you get values mixed from different table rows. @Taryn's second (longer) alternative, using an inner join and a subquery, works, but the query is needlessly complicated and is 5 times slower in my use case than my simple alternative below.
Consider the table questions:
id | asker
-----------
1 | Bob
2 | Bob
3 | Marley
The query SELECT max(id) as id, asker FROM questions GROUP BY asker ORDER BY id DESC
returns the expected result:
id | asker
-----------
3 | Marley
2 | Bob
Now consider another version of the table questions:
id | asker | other
-------------------
1 | Bob | 1st
2 | Bob | 2nd
3 | Marley | 3rd
The query SELECT max(id) as id, asker, other FROM questions GROUP BY asker ORDER BY id DESC
returns an unexpected result:
id | asker | other
-------------------
3 | Marley | 3rd
2 | Bob | 1st
... note that the value of other for the second row of the result is incorrect, because id=2 comes from the second row of the table but other=1st comes from the first row! That is why many users in the comments on Taryn's answer report that this solution does not work.
A possible simple solution when selecting other columns as well is to use GROUP BY + DESC:
SELECT id, asker, other FROM questions GROUP BY asker DESC
id | asker | other
-------------------
3 | Marley | 3rd
2 | Bob | 2nd
(see demo: https://www.db-fiddle.com/f/esww483qFQXbXzJmkHZ8VT/10)
... but this simple solution has some limitations:
- The table should use InnoDB.
- There must be an index on the column used in GROUP BY, asker in this case (I think it is not a problem because you will get better performance since an index is suitable in this case. GROUP BY usually needs to create a temporary table, but when an index is available no temporary table is created, which is faster).
- If the ONLY_FULL_GROUP_BY SQL mode is enabled, you have to disable it (e.g. SET SESSION sql_mode = '';) or use ANY_VALUE() on selected columns which are not aggregated to avoid error ER_WRONG_FIELD_WITH_GROUP.
- Where GROUP BY ... DESC is not available, an alternative is GROUP BY col1 ORDER BY col1 ASC/DESC:
SELECT id, asker, other FROM questions GROUP BY asker ORDER BY asker DESC
id | asker | other
-------------------
3 | Marley | 3rd
2 | Bob | 2nd
(see demo: https://www.db-fiddle.com/f/esww483qFQXbXzJmkHZ8VT/11)
... the result is the same as above with GROUP BY ... DESC (do not forget to use InnoDB and create an index).
Upvotes: 6
Reputation: 247680
If you want the last id for each asker, then you should use an aggregate function:
SELECT max(id) as id,
asker
FROM questions
GROUP by asker
ORDER by id DESC
The reason you were getting the unusual result is that MySQL uses an extension to GROUP BY which allows items in the select list to be nonaggregated and not included in the GROUP BY clause. This, however, can lead to unexpected results because MySQL can choose the values that are returned. (See MySQL Extensions to GROUP BY)
From the MySQL Docs:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
Now if you had other columns that you need to return from the table, but don't want to add them to the GROUP BY due to the inconsistent results that you could get, then you could use a subquery to do so. (Demo)
select
q.Id,
q.asker,
q.other -- add other columns here
from questions q
inner join
(
-- get your values from the group by
SELECT max(id) as id,
asker
FROM questions
GROUP by asker
) m
on q.id = m.id
order by q.id desc
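To see that the join keeps every column on the same row, here is a minimal runnable sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the values in the other column ('1st', '2nd', '3rd') are hypothetical sample data added only for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions (id INTEGER PRIMARY KEY, asker TEXT, other TEXT)"
)
# Hypothetical sample rows; 'other' is an illustrative extra column.
conn.executemany(
    "INSERT INTO questions (id, asker, other) VALUES (?, ?, ?)",
    [(1, "Bob", "1st"), (2, "Bob", "2nd"), (3, "Marley", "3rd")],
)

# The derived table m holds one max id per asker; joining back on id
# returns the complete row that owns each of those ids.
joined_rows = conn.execute(
    """
    SELECT q.id, q.asker, q.other
    FROM questions q
    INNER JOIN (
        SELECT MAX(id) AS id, asker
        FROM questions
        GROUP BY asker
    ) m ON q.id = m.id
    ORDER BY q.id DESC
    """
).fetchall()

print(joined_rows)  # [(3, 'Marley', '3rd'), (2, 'Bob', '2nd')]
conn.close()
```

Because the outer query reads whole rows from questions, Bob's row carries other = '2nd', which is the value stored alongside id 2.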
Upvotes: 69
Reputation: 158
To get every column:
SELECT * FROM questions
WHERE id IN
(SELECT max(id)
FROM questions
GROUP BY asker)
ORDER BY id DESC
Improved version of the answer of @bluefeet.
Upvotes: 1
Reputation: 233
Normally MySQL's GROUP BY takes records in ascending order only, so we can order the records in a subquery before grouping.
SELECT * FROM ( SELECT * FROM questions ORDER BY id DESC ) AS questions GROUP BY questions.asker
Upvotes: 21
Reputation: 618
The others are correct about using MAX(ID) to get the results you want. If you're wondering why your query doesn't work, it's because ORDER BY happens after the GROUP BY.
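The order of operations can be illustrated outside the database with a small Python sketch. This is only a model of the logic, not of how MySQL is implemented: the "first row wins" rule below is an assumption chosen to mimic the result in the question, since MySQL does not promise which row it picks.

```python
rows = [(1, "Bob"), (2, "Bob"), (3, "Marley")]

# Step 1 (GROUP BY without an aggregate): one row per asker is kept.
# Assumption: the first row encountered wins, as in the question's output.
picked = {}
for row_id, asker in rows:
    picked.setdefault(asker, row_id)

# Step 2 (ORDER BY): sorting only reorders the rows that were already picked.
grouped_then_sorted = sorted(((i, a) for a, i in picked.items()), reverse=True)

# With MAX(id), the choice per group is made explicitly during grouping.
max_per_asker = {}
for row_id, asker in rows:
    max_per_asker[asker] = max(row_id, max_per_asker.get(asker, row_id))
with_max = sorted(((i, a) for a, i in max_per_asker.items()), reverse=True)

print(grouped_then_sorted)  # [(3, 'Marley'), (1, 'Bob')]
print(with_max)             # [(3, 'Marley'), (2, 'Bob')]
```

By the time the sort runs, Bob's row with id 2 has already been discarded, so no sort direction can bring it back.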
Upvotes: 3
Reputation: 263703
The records need to be grouped using GROUP BY and MAX() to get the maximum ID for every asker.
SELECT asker, MAX(ID) ID
FROM TableName
GROUP BY asker
OUTPUT
╔════════╦════╗
║ ASKER ║ ID ║
╠════════╬════╣
║ Bob ║ 2 ║
║ Marley ║ 3 ║
╚════════╩════╝
Upvotes: 5