kstats9pt3
kstats9pt3

Reputation: 873

SAS proc sql find min/max only among missing/non-missing within group

I have multiple dates per IID and I'm looking to find the max (maybe min down the road) within the group only if another variable is missing. Take the following test data:

data test;
    informat EFF_DATE END_DATE date9.;
    length IID $12 EFF_DATE END_DATE 8;
    input IID $ EFF_DATE END_DATE;
    format EFF_DATE END_DATE date9.;
    datalines;
    A   11MAY15 .   
    A   12SEP13 01JAN15
    A   13DEC12 .
    B   25AUG14 .   
    B   13JUN17 01JAN15
    B   03NOV13 .
    ;
run;

My plan is to build off the following code to calculate the max EFF_DATE by IID, but only if END_DATE is missing which I cannot figure out how to add this logic in sql.

proc sql noprint;
    create table test_fill as
    select *,
        max(EFF_DATE) /* add something here? */ as MOST_RECENT_EFF_DATE format=date9.
    from test group by IID;
quit;

The following shows the desired output:

IID EFF_DATE    END_DATE    MOST_RECENT_EFF_DATE
A   11MAY15     .           11MAY15
A   12SEP13     01JAN15     11MAY15
A   13DEC12     .           11MAY15
B   25AUG14     .           25AUG14
B   13JUN17     01JAN15     25AUG14
B   03NOV13     .           25AUG14

For IID=B, even though 13JUN17 is the max date we cannot accept it because END_DATE is non-missing so I need to take the next highest date. I'm sure I could write a bunch of data steps and merge back but I'd like to try to knock it out in sql if possible.

Upvotes: 1

Views: 2598

Answers (1)

Joe
Joe

Reputation: 63424

You include the condition as a case inside the max function, with else being null (which is not max ever, nor is it min).

proc sql noprint;
    create table test_fill as
    select *,
        max(case when missing(end_Date) then EFF_DATE else . end) as MOST_RECENT_EFF_DATE format=date9.
    from test group by IID;
quit;

Upvotes: 4

Related Questions