Jif112
Jif112

Reputation: 3

Select rows with the most recent two dates based on value changes

I have a table

Date Job_Id Employee_ID Status
2021-01-01 Teacher 10 Active
2021-02-01 Teacher 10 Active
2021-03-01 Teacher 10 Retired
2021-04-01 Teacher 10 Active
2021-01-01 Barber 11 Active
2021-02-01 Barber 11 Suspended
2021-03-01 Barber 11 Active
2021-04-01 Barber 11 Active

and I would like a query to return this result set

Date Job_Id Employee_ID Status
2021-03-01 Teacher 10 Retired
2021-04-01 Teacher 10 Active
2021-02-01 Barber 11 Suspended
2021-03-01 Barber 11 Active

Essentially, it should return the two most recent records when the status changed. I'm currently experimenting with a lag function but I'm not too proficient at SQL yet and here's where I'm stuck.

SELECT Date, Job_ID,Employee_ID, [Status], PrevStatus FROM 
    (
    SELECT 
Date,
    Job_ID,
      Employee_ID,
      [Status],
      LAG([Status]) OVER (PARTITION BY Job_ID ORDER BY ExtractDate ) AS PrevStatus
FROM    #Statuses
) AS a
WHERE a.PrevStatus <> [Status] AND a.PrevStatus IS NOT NULL

Would anyone have any ideas on how to modify this?

Upvotes: 0

Views: 753

Answers (3)

Yohann V.
Yohann V.

Reputation: 56

Maybe, I don't understand the use case but why do you want to use LAG?

I would do something like this?

SELECT MAX(Date), Job_Id, Employee_ID, Status
FROM #Statuses
GROUP BY Job_ID, Employee_ID, Status

Upvotes: 0

Squirrel
Squirrel

Reputation: 24763

Looks like you wanted is the same as the last 2 unique status for each employee

SELECT *
FROM
(
    select *, 
           rn = row_number() over (partition by Employee_ID, [Status]
                                       order by [Date] desc)
    from   #Statuses
) d
WHERE  d.rn = 1
order by Employee_ID, [Date]

Upvotes: 1

sticky bit
sticky bit

Reputation: 37472

Hmm, I guess you need lag() and lead() to get all rows where the status in the next or the previous row changes. You then need to use row_number() on that result to get the last two.

SELECT date,
       job_id,
       employee_id,
       status
       FROM (SELECT date,
                    job_id,
                    employee_id,
                    status,
                    row_number() OVER (PARTITION BY employee_id,
                                                    job_id
                                       ORDER BY date DESC) row_number
                    FROM (SELECT date,
                                 job_id,
                                 employee_id,
                                 status,
                                 lag(status, 1, status) OVER (PARTITION BY employee_id,
                                                                           job_id
                                                              ORDER BY date ASC) prevstatus,
                                 lead(status, 1, status) OVER (PARTITION BY employee_id,
                                                                            job_id
                                                               ORDER BY date ASC) nextstatus
                                 FROM elbat) x
                    WHERE status <> prevstatus
                           OR status <> nextstatus) y
            WHERE row_number <= 2
            ORDER BY employee_id ASC,
                     date ASC;

db<>fiddle

Upvotes: 1

Related Questions