Riccardo Pirovano
Riccardo Pirovano

Reputation: 13

updating last not blank record SAS

I have the following dataset.

ID    var1    var2    var3 
1     100     200     
1     150             300 

2             120                
2     100     150     200   

3     200             150             
3     250    300                      

I would like to have a new dataset with only the last not blank record for each group of variables.

id    var1    var2    var3
1     150     200     300              
2     100     150     200               
3     250     300     150      

last. select the last reord, but i need to selet the last not null record

Upvotes: 0

Views: 191

Answers (2)

Tom
Tom

Reputation: 51566

Looks like you want the last non missing value for each non-key variable. So you can let the UPDATE statement do the work for you. Normally for update operation you apply transactions to a master dataset. But for your application you can use OBS=0 dataset option to make your current dataset work as both the master and the transactions.

data want ;
  update have(obs=0) have ;
  by id;
run;

Upvotes: 3

Richard
Richard

Reputation: 27498

Riccardo:

There are many ways to select, within each group, the last non-missing value of each column. Here are three ways. I would not say one is the best way, each approach has it's merits depending on the specific data set, coder comfort and long term maintainability.

Way 1 - UPDATE statement

Perhaps the simplest of the coding approaches goes like this:

  • make a data set that has one row per id and the same columns as the original data.
  • use the UPDATE statement to replace each like named variable with a non-missing value.

Example:

data want_base_table(label="One row per id, original columns");
  set have;
  by id;
  if first.id;
run;

* use have as a transaction data set in the update statement;
data want_by_update;
  update want_base_table have;
  by id;
run;

Way 2 - DOW loop

Others will involve arrays and first. and last. flag variables of the BY group. This example shows a DOW loop that tracks the non-missing value and then uses them for the output of each ID:

data want_dow;
  do until (last.id);
    set have;
    by id;

    array myvar var1-var3 ;
    array myhas has1-has3 ;

    do _i = 1 to dim(myvar);
      if not missing (myvar(_i)) then 
        myhas(_i) = myvar(_i);
    end;
  end;

  do _i = 1 to dim(myhas);
    myvar(_i) = myhas(_i);
  end;

  output;

  drop _i has1-has3;
run;

A loop is most often called a DOW loop when there is a SET statement inside the DO; END; block and the loop termination is triggered by the last. flag variable. A similar non DOW approach (not shown) would use the implicit loop and first. to initialize the tracking array and last. for copying the values (tracked within group) into the columns for output.

Way 3 - Merging column slices

data want_by_column_slices;
  merge
    have (keep=id var1 where=(var1 ne .))
    have (keep=id var2 where=(var2 ne .))
    have (keep=id var3 where=(var3 ne .))
  ;
  by id;
  if last.id;
run;

Upvotes: 0

Related Questions