Reputation: 15
Here is my dataset:
****************
* name * value *
* x * # *
* x * . *
* x * . *
* x * . *
* y * . *
* y * # *
* y * . *
* y * . *
* z * . *
* z * . *
* z * # *
* z * . *
* z * . *
* z * # *
****************
What I am trying to do is carry each number (#) forward until the end of the rows for each name. The result would be as follows:
****************
* name * value *
* x * # *
* x * # *
* x * # *
* x * # *
* y * . *
* y * # *
* y * # *
* y * # *
* z * . *
* z * . *
* z * # *
* z * # *
* z * # *
* z * # *
****************
The . stays because there is no data for that point. I just need to fill in the rows that follow a value.
So far, I had code that looked like this:
DATA test;
SET test;
retain _variable;
if not missing(variable) then _variable=variable;
else variable=_variable;
drop _variable;
RUN;
It does not work because the last # value for x carries over to the first row of y. I thought of using a do until(last.variable) loop, but I was not able to make it work.
Please help.
Upvotes: 0
Views: 1169
Reputation: 9109
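You might consider the UPDATE trick. It has the qualities you seek, plus it carries the last observation forward (LOCF) for all variables. It works because UPDATE does not overwrite an existing value with a missing one, the empty master (obs=0) with BY name starts each name group fresh, and the explicit OUTPUT writes every row instead of only the last row per group.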
data value;
input (name value value2)(:$1.);
cards;
x # $
x . @
x . .
x . .
y . $
y # .
y . @
y . .
z . .
z . .
z # $
z . .
z . .
z # @
;;;;
run;
proc print;
run;
data locf;
update value(obs=0) value;
by name;
output;
run;
proc print;
run;
Upvotes: 1
Reputation: 187
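Alternatively, you could reset the retained variable to missing at the start of each name group, which avoids needing the OUTPUT statement:
data want(drop = temp);
  set have;
  by name;
  retain temp;
  if first.name then temp = .; /* reset at the start of each name group */
  if value ne . then temp = value; /* remember the latest non-missing value */
  else if value eq . then value = temp; /* fill the gap from the same group */
run;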
Upvotes: 0
Reputation: 12909
You were very close. Instead, use the retain statement on a new variable called _value and clear it once each name group ends, so nothing carries over into the next group. At the end, drop the original value, then rename _value back to value using a data set option.
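DATA want(rename=(_value = value));
  SET have;
  by name;
  retain _value;
  /* remember the latest non-missing value within the group */
  if (NOT missing(value)) then _value = value;
  output;
  /* once the last row of a group is written, clear the carried value */
  if (last.name) then call missing(_value);
  drop value;
RUN;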
Upvotes: 0
Reputation: 428
Order your have dataset by name.
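A minimal sketch of that sorting step, assuming the input dataset is named have as in the step that follows:

```sas
/* BY-group processing needs the rows grouped by name */
proc sort data=have;
  by name;
run;
```

As far as I know, PROC SORT keeps the original row order within each name by default (the EQUALS option). That matters here, because the fill depends on the order of the rows inside each group.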
Then:
data want;
  set have;
  by name;
  retain new_value;
  if value ne . then new_value = value;
  output;
  if last.name then new_value = .;
run;
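After the output statement, we check whether this is the last row for the name. If it is, we set new_value to missing, so that the next name group starts empty and only picks up a number once one appears. Note that the filled values end up in new_value; value itself is left unchanged.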
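The do until(last.name) idea mentioned in the question can also be made to work. The following is only a sketch, assuming have is sorted by name; _value is a hypothetical helper variable, not something from the original data:

```sas
data want;
  /* one pass of the data step processes one whole name group */
  do until (last.name);
    set have;
    by name;
    /* _value is a hypothetical helper; it is not retained, so it is
       set back to missing each time the data step starts a new group */
    if not missing(value) then _value = value;
    else value = _value;
    output; /* write every row, not just the last one of the group */
  end;
  drop _value;
run;
```

Because the loop reads an entire group before the data step returns to the top, no retain statement and no explicit reset are needed. The automatic reset of _value at the top of each iteration does that job.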
Upvotes: 0