Victor
Victor

Reputation: 17107

How to extract a SAS record which has the max value for a column

I have a SAS dataset like this:

col1   col2   col3   col4    col5  col6
A1     B1     C1     D1      E1    $100
A1     B1     C1     D2      E2    $200
A2     B2     C2     D3      E3    $500

The first 3 columns are my key columns. I need to extract the row that has got the highest value for col6.

So I can do:

proc sql;
   create table temp as 
   select col1,col2,col3,max(col6) as col6 
   from dataset 
   group by 1,2,3;
   select * from dataset t1 
   inner join temp t2 
   where t1.col1 = t2.col1 and t1.col2 = t2.col2 
     and t1.col3 = t2.col3 and t1.col6 = t2.col6;
quit;

But how can I achieve the same by making one pass through the data? Is there a way?

Upvotes: 0

Views: 2281

Answers (2)

Data Null
Data Null

Reputation: 26

data col;
    input (col1-col5)(:$2.)  col6:comma.;
    cards;
A1     B1     C1     D1      E1     $100
A1     B1     C1     D2      E2     $200
A2     B2     C2     D3      E3      $500
;;;;
    run;
proc print;
    run;
proc summary data=col;
    output out=maxrow idgroup(max(col6) out(_all_)=);
    run;
proc print;
    run;

Upvotes: 0

Jeff
Jeff

Reputation: 1807

Your approach is perfectly fine for many uses. If using exactly one pass is actually essential, you could use a data step and a hash object. This reads every record once and updates the single row in the hash object every time it finds a row with a higher col6 than it's yet seen.

data _null_;
    if 0 then set have; /*Make sure all vars used in the hash object are set up with the correct types*/
    retain highest_so_far;
    if _n_ = 1 then do;
        highest_so_far = col6;
        declare hash hi_row();
        hi_row.definekey(co1,col2,col3,col4,col5,col6);
        hi_row.definedone();
    end;

    set have end=eof;

    if col6 > highest_so_far then do;
        hi_row.clear();
        hi_row.add();
        highest_so_far = col6;
    end;

    if (eof) then hi_row.output(want);
run;

If there's a tie for highest, this program would return the first, but it could be modified to return an arbirary number of ties.

Upvotes: 5

Related Questions