Reputation: 137
I've looked around on the internet for the solution to this problem but can't seem to find exactly what I am looking for.
lets say that we have a table that looks like this:
_____________________________________________________
| id (PK) | name | date | ... other data ... |
-----------------------------------------------------
| 1 | BOB | 2016-11-20 | ....
--------------------------------
| 3 | CARL | 2015-09-02 | ....
--------------------------------
| 4 | BOB | 2016-11-18 | ....
--------------------------------
| 5 | JON | 2016-03-03 | ....
--------------------------------
| 6 | TIM | 2016-11-24 | ....
--------------------------------
| 7 | TIM | 2016-11-24 | ....
--------------------------------
| 8 | JON | 2016-05-05 | ....
--------------------------------
I need a query that will:
1. return a distinct list of names (distinct function)
2. for each name have the earliest date (min function)
3. id that goes with the above two columns.
4. if the date is the same, use min(id) as a way to decide which row to select
so the desired results in this case would look like this..
_____________________________________________________
| id (PK) | name | date | ... other data ... |
-----------------------------------------------------
| 3 | CARL | 2015-09-02 | ....
--------------------------------
| 4 | BOB | 2016-11-18 | ....
--------------------------------
| 5 | JON | 2016-03-03 | ....
--------------------------------
| 6 | TIM | 2016-11-24 | ....
--------------------------------
Is there a way to do this efficiently in a query without having to create a temporary table and loop over it using script?
This gets me half of the way there:
SELECT distinct(name), min(date), ... other data ... from table1
when I apply the min function to id though it doesn't behave in the way that I need. In the case of BOB it mixes the two rows and returns 1 for 'id' and '2016-11-18' for the date - which is completely incorrect
Any help is much appreciated.
Upvotes: 0
Views: 455
Reputation: 40309
SQL Server 2005 and up support windowing functions, and row_number can be singularly useful. This or something very similar should work.
SELECT
name
,date
,id
from (-- Can't reference the windowing function within it's own select,
-- so we make it a subquery
select
name
,date
,id
,row_number() over (partition by name order by date, id) Ranking
from Table1
) sub
where Ranking = 1
order by name
Upvotes: 1
Reputation: 14341
;WITH cte AS (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY name ORDER BY Date, Id) as RowNum
FROM
@Table
)
SELECT *
FROM
cte
WHERE
RowNum = 1
So if you think of it slightly different. You want the earliest Date row per name. So if you create a partitioned ROW_NUMBER()
on name ordered by Date Ascending and then select where ROW_NUMBER() = 1
you will have your answer. There are ways to do this with the aggregation as well. You can also add id in the order by if you want the first id if the Date is the same as is the case with Tim.
Upvotes: 0