Sam
Sam

Reputation: 33

SQL Server - Select most recent records with condition

I have a table like this.

Table :

ID  EnrollDate  ExitDate
1   4/1/16      8/30/16 
2   1/1/16      null
2   1/1/16      7/3/16
3   2/1/16      8/1/16
3   2/1/16      9/1/16
4   1/1/16      12/12/16
4   1/1/16      12/12/16
4   1/1/16      12/12/16
4   1/1/16      null
5   5/1/16      11/12/16
5   5/1/16      11/12/16
5   5/1/16      11/12/16

Need to select the most recent records with these conditions.

  1. One and only one record has the most recent enroll date - select that
  2. Two or more share same most recent enroll date and one and only one record has either a NULL Exit Date or the most recent Exit Date - Select the record with null. If no null record pick the record with recent exit date
  3. Two or more with same enroll and Exit Date - If this case exists, don't select those record

So the expected result for the above table should be :

ID  EnrollDate  ExitDate
1   4/1/16      8/30/16 
2   1/1/16      null
3   2/1/16      9/1/16
4   1/1/16      null

I wrote the query with group by. I am not sure how to select with the conditions 2 and 3.

select t1.*  from table t1
INNER JOIN(SELECT Id,MAX(EnrollDate) maxentrydate
FROM table
GROUP BY Id)t2 ON EnrollDate = t2.maxentrydate and t1.Id=t2.Id 

Please let me know what is the best way to do this.

Upvotes: 3

Views: 4986

Answers (3)

sstan
sstan

Reputation: 36473

Using the rank() window function, I think it's possible.

This is untested, but it should work:

select t.ID, t.EnrollDate, t.ExitDate
  from (select t.*,
               rank() over(
                 partition by ID 
                 order by EnrollDate desc,
                          case when ExitDate is null then 1 else 2 end,
                          ExitDate desc) as rnk
          from tbl t) t
 where t.rnk = 1
 group by t.ID, t.EnrollDate, t.ExitDate
having count(*) = 1

The basic idea is that the rank() window function will rank the most "recent" rows with a value of 1, which we filter on in the outer query's where clause.

If more than one row have the same "most recent" data, they will all share the same rank of 1, but will get filtered out by the having count(*) = 1 clause.

Upvotes: 4

dcieslak
dcieslak

Reputation: 2715

with B as (
select id, enrolldate , 
  exitdate,
  row_number() over (partition by id order by enrolldate desc, case when exitdate is null then 0 else 1 end, exitdate desc) rn
 from ab )
 select b1.id, b1.enrolldate, b1.exitdate from b  b1
 left join b b2
 on b1.rn = b2.rn -1 and
    b1.id = b2.id and
    b1.exitdate = b2.exitdate and
    b1.enrolldate = b2.enrolldate
where b1.rn = 1 and 
     b2.id is nULL

The left join is used to fullfill the 3) requirement. When record is returned then we don't want it.

Upvotes: 3

Felix Pamittan
Felix Pamittan

Reputation: 31879

Use ROW_NUMBER coupled with CASE expression to achieve the desired result:

WITH Cte AS(
    SELECT t.*,
        ROW_NUMBER() OVER(
            PARTITION BY t.ID
            ORDER BY
                t.EnrollDate DESC,
                CASE WHEN t.ExitDate IS NULL THEN 0 ELSE 1 END,
                t.ExitDate DESC
        ) AS rn
    FROM Tbl t
    INNER JOIN (
        SELECT 
            ID, 
            COUNT(DISTINCT CHECKSUM(EnrollDate, ExitDate)) AS DistinctCnt, -- Count distinct combination of EnrollDate and ExitDate per ID
            COUNT(*) AS RowCnt -- Count number of rows per ID
        FROM Tbl
        GROUP BY ID
    ) a
        ON t.ID = a.ID
    WHERE 
        (a.DistinctCnt = 1 AND a.RowCnt = 1)
        OR a.DistinctCnt  > 1
)
SELECT 
    ID, EnrollDate, ExitDate
FROM Cte c 
WHERE Rn = 1
  • The ORDER BY clause in the ROW_NUMBER takes care of conditions 2 and 3.
  • The INNER JOIN and the WHERE clause take care of 1 and 4.

ONLINE DEMO

Upvotes: 4

Related Questions