Reputation: 2342
I have table in SAS Enterprise Guide like below:
Data type:
Values are not sorted in original table, but it could be if it is important for solution
ID | DT | OF_1 | OF_2 |
---|---|---|---|
111 | 01MAY2021 | PR | PR |
111 | 05MAY2021 | PR | P |
111 | 10MAY2021 | P | P |
222 | 30MAY2021 | PR | M |
222 | 01SEP2022 | M | PR |
333 | 05SEP2021 | PR | P |
333 | 07SEP2021 | P | PR |
333 | 11SEP2021 | PR | M |
444 | 15AUG2022 | M | M |
And I need to create numeric column "RESULT" where:
30SEP2022'd - "DT" of change PR to P or M
("DT" of return to PR)
- ("DT" of change PR to P or M)
So as a result I need something like below (in DT column is date of the longest changed):
ID | DT | RESULT |
---|---|---|
111 | 05MAY2021 | 513-> changed PR to P or M in 05MAY2021 and did not return to PR so '30SEP2022'd -'05MAY2021'd = 513 |
222 | 30MAY2021 | 459 -> changed PR to P or M in 30MAY2021 and reutrn to PR in 01SEP2021 so '01SEP2021'd - '30MAY2021'd = 459 |
333 | 11SEP2021 | 384 -> changed PR to P or M 2 times: 05SEP2021 and 11SEP2021, but for the first time he lasted only 2 days because he returned to PR after 2 days, for the second time he lasted: '30SEP2022'd - '11SEP2021'd = 384 , and for this ID max is 384 |
444 | - | 0 -> because he have never changed PR to P or M |
How can I do that in SAS Enterprise Guide normal SAS code or in PROC SQL ?
Upvotes: 0
Views: 299
Reputation: 7769
proc sort data=have ; by ID DT ; run ; data want ; set have ; by ID DT ; retain DAT_PR2PM MAX_PM_DAYS MAX_PM_DATE . ; if first.ID then call missing(DAT_PR2PM, MAX_PM_DAYS, MAX_PM_DATE) ; /* store date of PR > P/M */ if OF_1 = 'PR' and OF_2 in('P','M') then DAT_PR2PM = DT ; /* P/M > PR, calculate days / retain max */ if OF_2 = 'PR' and OF_1 in('P','M') and not missing(DAT_PR2PM) then do ; DAYS = sum(DT, -DAT_PR2PM) ; MAX_PM_DAYS = max(DAYS, MAX_PM_DAYS) ; if DAYS = MAX_PM_DAYS then MAX_PM_DATE = DT ; call missing(DAT_PR2PM) ; /* set back to missing */ end ; /* last record for ID, seen PR > P/M but no P/M > PR */ if last.ID and not missing(DAT_PR2PM) then do ; DAYS = sum(date(), -DAT_PR2PM) ; MAX_PM_DAYS = max(DAYS, MAX_PM_DAYS) ; if DAYS = MAX_PM_DAYS then MAX_PM_DATE = DT ; end ; /* output result for ID */ if last.ID ; keep ID MAX_PM_DAYS MAX_PM_DATE ; run ;
Upvotes: 0