Connection
Connection

Reputation: 349

SQL Group By Issue

SELECT SomeId, ModifiedUser, MAX(ModifiedDate) 
FROM SomeTable 
GROUP BY SomeId, ModifiedUser

Then I get results like this:

1000    User1    Mar 30 2011
1000    User2    Jun 25 2011
1001    User3    Mar 21 2011
1001    User4    Jun 20 2011

How do I modify the query so I get only

1000    User2    Jun 25 2011
1001    User4    Jun 20 2011

Basically take the first query results then group by SomeId with latest ModifiedDate and ModifiedUser connected to the latest ModifiedDate

Upvotes: 3

Views: 161

Answers (3)

Andrew Lazarus
Andrew Lazarus

Reputation: 19310

Windowing functions were invented for this:

SELECT SomeID, ModifiedUser, ModifiedDate FROM
( SELECT SomeID, ModifiedUser, ModifiedDate,
  row_number() over (PARTITION BY SomeID ORDER BY ModifiedDate DESC) AS rn
) AS subqry
WHERE rn=1;

If you want all rows in case of a tie, replace row_number by rank.

You don't mention choice of RDBMS. MySQL does not yet have this capability. Many others do.

Upvotes: 0

Tom H
Tom H

Reputation: 47444

There are several ways to do this. Here's one:

SELECT
    T1.some_id,
    T1.modified_user,
    T1.modified_date
FROM
    Some_Table T1
INNER JOIN (
        SELECT some_id, MAX(modified_date) AS max_modified_date
        FROM Some_Table
        GROUP BY some_id) SQ ON
    SQ.some_id = T1.some_id AND
    SQ.max_modified_date = T1.modified_date

Here's another:

SELECT
    T1.some_id,
    T1.modified_user,
    T1.modified_date
FROM
    Some_Table T1
WHERE
    T1.modified_date = (
                  SELECT MAX(T2.modified_date)
                  FROM Some_Table T2
                  WHERE T2.some_id = T1.some_id)

And another...

SELECT
    T1.some_id,
    T1.modified_user,
    T1.modified_date
FROM
    Some_Table T1
LEFT OUTER JOIN Some_Table T2 ON
    T2.some_id = T1.some_id AND
    T2.modified_date > T1.modified_date
WHERE
    T2.some_id IS NULL

The correlated subquery in the second method often has performance issues, but you can try all three and see which works best for you. Also, consider the possibility of "ties" in your dates. If a tie can happen then you can add in another column or columns that would guarantee uniqueness (such as the columns in the PK) to the first and third approaches.

Upvotes: 1

Tim Almond
Tim Almond

Reputation: 12686

You have to do 2 queries and join them.

1st query:

SELECT SomeId, Max(ModifiedDate) 
FROM SomeTable GROUP BY SomeId

This singles out the "unique information". Then, you join that with:

SELECT SomeId, ModifiedUser, ModifiedDate 
from SomeTable 

to give you all the data. So, the complete query is:

SELECT SomeId, ModifiedUser, ModifiedDate 
FROM SomeTable 
INNER JOIN (
            SELECT SomeId, Max(ModifiedDate) maxmod 
            FROM SomeTable 
            GROUP BY SomeId) uniqueinfo ON SomeTable.SomeId = uniqueinfo.SomeId 
                                       AND ModifiedDate = uniqueinfo.maxmod

Upvotes: 3

Related Questions