Gokay
Gokay

Reputation: 75

Copy a value from a line below in SAS

I hope this is not a duplicate question. I've searched the forum and retain function seems to be choice of weapon but it copies down an observation, and I'm trying to do the following; for a given id, copy the second line to the first line for the x value. also first value of x is always 2.

Here's my data;

id    x
3   2 
3   1
3   1
2   2
2   1
2   1
6   2
6   0
6   0

and i want it to look like this;

id    x
3   1 
3   1
3   1
2   1
2   1
2   1
6   0
6   0
6   0

and here's the starter code;

data have;
input id x;
cards;
3   2 
3   1
3   1
2   2
2   1
2   1
6   2
6   0
6   0
;
run;

Upvotes: 1

Views: 1576

Answers (2)

Joe
Joe

Reputation: 63424

If your data is exactly as you say, then you can use a lookahead merge. It literally takes the dataset and merges itself to a copy of the dataset that starts on row 2, side-to-side. You just have to check that you're still on the same ID. This does change the value of x for all records to the value one hence, not just the first; you could add additional code to pay attention to that (but can't use FIRST and LAST).

data want;
  merge have have(firstobs=2 rename=(id=newid x=newx));
  if newid=id then x=newx;
  keep x id;
run;

If you don't have any additional variables of interest, then you can do something even more interesting: duplicate the second row in its entirety and delete the first row.

data want;
  set have;
  by id notsorted;
  if first.id then do;
    firstrow+1;
    delete;
  end;
  if firstrow=1 then do;
    firstrow=0;
    output;
  end;
  output;
run;

However, the "safest" method (in terms of doing most likely what you want precisely) is the following, which is a DoW loop.

data want;
  idcounter=0;
  do _n_ = 1 by 1 until (last.id);
    set have;
    by id notsorted;
    idcounter+1;
    if idcounter=2 then second_x = x;
  end;
  do _n_=1 by 1 until (last.id);
    set have;
    by id notsorted;
    if first.id then x=second_x;
    output;
  end;
run;

This identifies the second x in the first loop, for that BY group, then in the second loop sets it to the correct value for row 1 and outputs.

In both of the latter examples I assume your data is organized by ID but not truly sorted (like your initial example is). If it's not organized by ID, you need to perform a sort first (but then can remove the NOTSORTED).

Upvotes: 0

JJFord3
JJFord3

Reputation: 1985

Lead is tricky in SAS. You can sort in reverse and use a lag function to get around it though, and you are right: a retain statement will allow us to add an order variable so we can sort it back to its original format.

data have;
set have;
retain order;
lagid = lag(id);
if id ne lagid then order = 0;
order = order + 1;
drop lagid;
run;

proc sort data=have; by id descending order; run;

data have;
set have;
leadx = lag(x);
run;

proc sort data=have; by id order; run;

data have;
set have;
if order = 3 then x_fixed = x;
else x_fixed = leadx;
run;

Upvotes: 2

Related Questions