gbarbe77

Reputation: 15

Retain statement in SAS

Here is the dataset I have:

****************
* name * value *
*  x   *   #   *
*  x   *   .   *
*  x   *   .   *
*  x   *   .   *
*  y   *   .   *
*  y   *   #   *
*  y   *   .   *
*  y   *   .   *
*  z   *   .   *
*  z   *   .   *
*  z   *   #   *
*  z   *   .   *
*  z   *   .   *
*  z   *   #   *
****************

What I am trying to do is carry each number (#) forward until the end of the run of rows for its name. The result would be as follows:

****************
* name * value *
*  x   *   #   *
*  x   *   #   *
*  x   *   #   *
*  x   *   #   *
*  y   *   .   *
*  y   *   #   *
*  y   *   #   *
*  y   *   #   *
*  z   *   .   *
*  z   *   .   *
*  z   *   #   *
*  z   *   #   *
*  z   *   #   *
*  z   *   #   *
****************

The . stays because there is no data for that point. I just need to fill in the rows that follow a value within each name.

So far, I had code that looked like this:

DATA test;
  SET test;
  retain _variable;
  if not missing(variable) then _variable=variable;
  else variable=_variable;
  drop _variable;
RUN;

It does not work because the last # value for x carries over to the first rows of y. I thought of using a do until loop with last.variable, but I was not able to make it work.

Please help.

Upvotes: 0

Views: 1169

Answers (4)

data _null_

Reputation: 9109

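You might consider the UPDATE trick. It has the qualities you seek, plus it carries the last observation forward (LOCF) for all variables, not just one. The data set is used as the transaction for an empty copy of itself (obs=0). By default, UPDATE does not let missing transaction values overwrite what is already held, and each new BY group starts from missing. The explicit OUTPUT writes every row instead of only the last one per name.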

data value;
   input (name value value2)(:$1.);
   cards;
 x      #     $
 x      .     @ 
 x      .     . 
 x      .     . 
 y      .     $
 y      #     .
 y      .     @
 y      .     .
 z      .     .
 z      .     .
 z      #     $
 z      .     .
 z      .     .
 z      #     @
 ;;;;
   run;
proc print;
   run;
data locf;
   update value(obs=0) value;
   by name;
   output;
   run; 
proc print;
   run;


Upvotes: 1

user5072412

Reputation: 187

Alternatively, you could reset the retained variable to missing at the start of each name group, which avoids the need for an explicit OUTPUT statement:

data want(drop = temp);
  set have;
  by name;
  retain temp;

  if first.name then temp = .; /* reset at the start of each name group */

  if value ne . then temp = value; /* remember the latest non-missing value */
  else value = temp;               /* fill a missing value from it */
run;
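
To try the step above, a small have dataset is needed. Here is a minimal sketch, assuming value is numeric; the numbers are made up to stand in for the # placeholders in the question:

```sas
/* Hypothetical sample data: 1, 2, 3, 4 stand in for the # values */
data have;
  input name $ value;
  datalines;
x 1
x .
x .
y .
y 2
y .
z .
z 3
z .
z 4
;
run;
```

With this input, the leading missing rows of y and z should stay missing and the rest should be filled: x gets 1, 1, 1; y gets ., 2, 2; z gets ., 3, 3, 4.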

Upvotes: 0

Stu Sztukowski

Reputation: 12909

You were very close. Instead, use the retain statement on a new variable called _value and reset it at the end of each name group, after the last row of the group has been written. Then drop the original value and rename _value back to value using a data set option.

DATA want(rename=(_value = value) );
  SET have;
  by name;

  retain _value;

  if(NOT missing(value) ) then _value = value;

  output;

  if(last.name) then call missing(_value);

  drop value;
RUN;

Upvotes: 0

RamB

Reputation: 428

Sort your have dataset by name.

Then:

data want;
set have;
by name;
retain new_value;

if value ne . then new_value = value;
output;
if last.name then new_value = .;
run;

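After the output, we check whether this is the last row for the name. If it is, we set new_value to missing, so that the next row read by the set statement starts a fresh group and picks up a number only if one is present. Note that the filled values end up in new_value; the original value column is left unchanged.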
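
The sorting step mentioned at the top of this answer could look like the sketch below. It sorts have in place, which is an assumption on my part; use an out= data set if the original order must be kept somewhere.

```sas
/* Sort by name so that BY-group processing (first.name / last.name) works */
proc sort data=have;
  by name;
run;
```

By default, PROC SORT should keep the relative order of rows that share the same name, which matters here because the fill depends on row order. If the data is already grouped by name but the groups are not in alphabetical order, writing by name notsorted; in the data step is an alternative to sorting.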

Upvotes: 0
