Reputation: 75
I hope this is not a duplicate question. I've searched the forum and retain function seems to be choice of weapon but it copies down an observation, and I'm trying to do the following; for a given id, copy the second line to the first line for the x value. also first value of x is always 2.
Here's my data;
id x
3 2
3 1
3 1
2 2
2 1
2 1
6 2
6 0
6 0
and i want it to look like this;
id x
3 1
3 1
3 1
2 1
2 1
2 1
6 0
6 0
6 0
and here's the starter code;
data have;
input id x;
cards;
3 2
3 1
3 1
2 2
2 1
2 1
6 2
6 0
6 0
;
run;
Upvotes: 1
Views: 1576
Reputation: 63424
If your data is exactly as you say, then you can use a lookahead merge. It literally takes the dataset and merges itself to a copy of the dataset that starts on row 2, side-to-side. You just have to check that you're still on the same ID. This does change the value of x for all records to the value one hence, not just the first; you could add additional code to pay attention to that (but can't use FIRST and LAST).
data want;
merge have have(firstobs=2 rename=(id=newid x=newx));
if newid=id then x=newx;
keep x id;
run;
If you don't have any additional variables of interest, then you can do something even more interesting: duplicate the second row in its entirety and delete the first row.
data want;
set have;
by id notsorted;
if first.id then do;
firstrow+1;
delete;
end;
if firstrow=1 then do;
firstrow=0;
output;
end;
output;
run;
However, the "safest" method (in terms of doing most likely what you want precisely) is the following, which is a DoW loop.
data want;
idcounter=0;
do _n_ = 1 by 1 until (last.id);
set have;
by id notsorted;
idcounter+1;
if idcounter=2 then second_x = x;
end;
do _n_=1 by 1 until (last.id);
set have;
by id notsorted;
if first.id then x=second_x;
output;
end;
run;
This identifies the second x in the first loop, for that BY group, then in the second loop sets it to the correct value for row 1 and outputs.
In both of the latter examples I assume your data is organized by ID but not truly sorted (like your initial example is). If it's not organized by ID, you need to perform a sort first (but then can remove the NOTSORTED).
Upvotes: 0
Reputation: 1985
Lead is tricky in SAS. You can sort in reverse and use a lag function to get around it though, and you are right: a retain statement will allow us to add an order variable so we can sort it back to its original format.
data have;
set have;
retain order;
lagid = lag(id);
if id ne lagid then order = 0;
order = order + 1;
drop lagid;
run;
proc sort data=have; by id descending order; run;
data have;
set have;
leadx = lag(x);
run;
proc sort data=have; by id order; run;
data have;
set have;
if order = 3 then x_fixed = x;
else x_fixed = leadx;
run;
Upvotes: 2