Reputation: 191
I seek to add a new column which indicates the last modification date of another column. In my case, I have projects which are initiated at a specific time(e.g. 5/1/19) with a specific status. In some instances a project may change its status (cancelled or stopped). Each morning a snapshot is taken, which in this case could be used to track the last modification date.
There are different projects (ID, Name) and different types of status.
Currently:
project_ID Name Status Date
1 ABC Cancelled 1/4/20
1 ABC Cancelled 1/3/20
1 ABC Continued 1/2/20
1 ABC Continued 1/1/20
.. ... ......... ......
1 ABC Continued 5/1/19
I'd like to achieve the following table:
project_ID Name Status Date LastModified
1 ABC Cancelled 1/4/20 1/3/20
1 ABC Cancelled 1/3/20 1/3/20
1 ABC Continued 1/2/20 5/1/19
1 ABC Continued 1/1/20 5/1/19
.. ... ......... ...... ......
1 ABC Continued 5/1/19 5/1/19
Upvotes: 0
Views: 863
Reputation: 191
I was able to solve the issue by using the Query suggested by @Philipp Johannis.
Here I used MIN instead of MAX:
SELECT project_id,
status,
date,
MIN(date)OVER(PARTITION BY project_id, status) AS LastModified
FROM TableABC
ORDER BY date DESC
I couldn't get the other two answers to run properly and this solutions seems to be easier to read and understand, which is why I'll highlight this one as solution.
Upvotes: 0
Reputation: 1269823
You can also approach this with lag()
and a cumulative max:
select t.*,
max(date) filter (where prev_status is distinct from status) over (partition by project_id, name order by date) as last_change_date
from (select t.*,
lag(status) over (partition by project_id, name order by date) as prev_status
from t
) t;
The subquery calculates the previous status, to identify any change. The outer query then takes the maximum of the date where a change in status is detected.
One advantage (or perhaps weakness?) of this approach is that it is robust if any of the snapshot dates are missing. Any such gaps are simply ignored.
Upvotes: 2
Reputation: 222482
This is a kind of gaps-and-island problem, where you want to identify the start of each island. The fact that you have one record per day simplifies the solution a little: I would use row_number()
and date arithmetics to define groups of adjacent records, and then a window min to get the first date of each group.
select t.*,
min(date) over(partition by project_id, status, date - rn * interval '1 day') last_modified
from (
select t.*, row_number() over(partition by project_id, status order by date) rn
from mytable t
) t
order by project_id, date
Upvotes: 1