Query that uses min aggregate function on multiple columns with priority

Question

I've looked around on the internet for the solution to this problem but can't seem to find exactly what I am looking for.

lets say that we have a table that looks like this:

_____________________________________________________
| id (PK) | name  |  date      | ... other data ... |
-----------------------------------------------------
|    1    |  BOB  | 2016-11-20 |      ....
--------------------------------     
|    3    |  CARL | 2015-09-02 |      ....
--------------------------------
|    4    |  BOB  | 2016-11-18 |      ....
--------------------------------
|    5    |  JON  | 2016-03-03 |      ....
--------------------------------
|    6    |  TIM  | 2016-11-24 |      ....
--------------------------------
|    7    |  TIM  | 2016-11-24 |      ....
--------------------------------
|    8    |  JON  | 2016-05-05 |      ....
--------------------------------

I need a query that will:
1. return a distinct list of names (distinct function)
2. for each name have the earliest date (min function)
3. id that goes with the above two columns.
4. if the date is the same, use min(id) as a way to decide which row to select

so the desired results in this case would look like this..

_____________________________________________________ 
| id (PK) | name  |    date    | ... other data ... |
-----------------------------------------------------
|    3    |  CARL | 2015-09-02 |      ....
--------------------------------
|    4    |  BOB  | 2016-11-18 |      ....
--------------------------------
|    5    |  JON  | 2016-03-03 |      ....
--------------------------------
|    6    |  TIM  | 2016-11-24 |      ....
--------------------------------

Is there a way to do this efficiently in a query without having to create a temporary table and loop over it using script?

This gets me half of the way there:
SELECT distinct(name), min(date), ... other data ... from table1

when I apply the min function to id though it doesn't behave in the way that I need. In the case of BOB it mixes the two rows and returns 1 for 'id' and '2016-11-18' for the date - which is completely incorrect

Any help is much appreciated.

Matt · Accepted Answer

;WITH cte AS (
    SELECT *
       ,ROW_NUMBER() OVER (PARTITION BY name ORDER BY Date, Id) as RowNum
    FROM
       @Table
)

SELECT *
FROM
    cte
WHERE
    RowNum = 1

So if you think of it slightly different. You want the earliest Date row per name. So if you create a partitioned ROW_NUMBER() on name ordered by Date Ascending and then select where ROW_NUMBER() = 1 you will have your answer. There are ways to do this with the aggregation as well. You can also add id in the order by if you want the first id if the Date is the same as is the case with Tim.

Query that uses min aggregate function on multiple columns with priority

Answers (2)

Related Questions