bergenga
bergenga

Reputation: 91

How to get SQL row by max of one column, group by another column

I need to read out users from an old database to use for statistics in a new system, but I do not have the original user table. However there is a statistics table with the sum of each year, where I also can find all needed user information. In addition this also only give me the users that were active, which is what I need.

The table has the following relevant columns: (the statistics columns are not relevant here)

I want UserID to be distinct, so it is the only column that I can have in GROUP BY. I will run MAX on Year, to get the values from the most recent year. FirstName, LastName and Email need to be the same as the row where MAX(Year). In other words people may have changed both names and emails during the years, and I only want the last one, as it's the only one that is relevant.

My best suggestion for a SQL query goes like this:

SELECT UserID, Firstname, LastName, Email, MAX(Year) AS Year
FROM myTable
GROUP BY UserID
ORDER BY LastName, FirstName

The only problem is that SQL Server 2008 will not let me do anyhing like that, because all columns either has to be with a function like MAX or a part of GROUP BY. The columns FirstName, LastName and Email can't be under GROUP BY because that will make too many records. It seems somehow to work to put MAX on all of them, but then I have no way of knowing which column the MAX function is actually working on. I do not know for sure that it will be a problem, but I don't have time to look through 100 000 rows to see if there actually is a problem.

So in short I want the whole row of five columns where MAX works on only one column and GROUP BY on another. Does anybody have a good solution, or is it actually safe to use MAX on all non-grouping rows?

Upvotes: 9

Views: 11350

Answers (2)

MatBailie
MatBailie

Reputation: 86706

A few answers...


Correlated sub-query...

SELECT
  *
FROM
  myTable
WHERE
  Year = (SELECT MAX(Year) FROM myTable AS lookup WHERE lookup.UserID = myTable.UserID)


Join on derived aggregate...

SELECT
  *
FROM
  myTable
INNER JOIN
  (SELECT UserID, MAX(Year) AS Year FROM myTable GROUP BY UserID) AS lookup
    ON  lookup.UserID = myTable.UserID
    AND lookup.Year   = myTable.Year


Ordered CTE using ROW_NUMBER()...

WITH
  sequenced_data AS
(
  SELECT
    ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Year DESC) AS sequence_id,
    *
  FROM
    myTable
)
SELECT
  *
FROM
  sequenced_data
WHERE
  sequence_id = 1

Upvotes: 17

Arvo
Arvo

Reputation: 10570

Dy you have only one year record per user? If yes, then you can use old'n'good join:

SELECT m.UserID, m.Firstname, m.LastName, m.Email, m.Year
FROM myTable m
    INNER JOIN (
        SELECT UserID, MAX(Year) as Year
        FROM myTable
        GROUP BY UserID
    ) x ON x.UserID=m.UserID and x.Year=m.Year
ORDER BY m.LastName, m.FirstName

Sure you can use constructs from newer SQL versions, I just have got used to older (=more generic) possibilities :).

Upvotes: 3

Related Questions