Sam
Sam

Reputation: 37

Extract specific rows from SAS dataset based on a particular cell value of a variable

I want to extract specific set of rows from a large SAS dataset based on a particular cell value of a variable into a new dataset. In this dataset, I have 6 variables. Following is an example of this dataset:

Variable names:   Var1      Var2      Var3   Var4   Var5  Var6
Row 1              A          1        2     3       4     5
Row 2              B          1        2     3       4     5
Row 3              A          1        2     3       4     5
Row 4              B          1        2     3       4     5
Row 5              Sample     1        2     3       4     5
Row 6              A          1        2     3       4     5
Row 7              B          1        2     3       4     5
Row 8              A          1        2     3       4     5
Row 9              B          1        2     3       4     5
Row 10             A          1        2     3       4     5
Row 11             B          1        2     3       4     5
Row 12             A          1        2     3       4     5
Row 13             B          1        2     3       4     5

From this dataset, I want to select a set of next 8 rows starting from a row in which Var 1 has a value = "Sample". I want to extract multiple such sets of 8 rows from this dataset into a new dataset. Can someone please guide me how I can accomplish this in SAS?

Thank you

Upvotes: 1

Views: 3804

Answers (3)

Reeza
Reeza

Reputation: 21294

You can set a counter and output as desired, use the RETAIN coupled with an IF (& OUTPUT) statement. You may need to tweak the IF condition but I think you get the idea here.

data want;
set have;
retain counter 10; 
if strip(Var1) = "Sample" then counter=1;
else counter+1;
if 2<=counter<=9 then OUTPUT;
*if 2<=counter<=9; *this is the same as above, but less code;
run;

Upvotes: 0

Quentin
Quentin

Reputation: 6378

One way to do this is to set a counter to 8 whenever the previous record has var1="Sample", and then decrement the counter for each record. And only output records where counter is >= 1.

data want ;
  set have ;

  if lag(var1) = "Sample" then counter = 8 ;
  else counter+(-1) ;  *counter is implicitly retained ;

  if counter>=1 then output ;
  * drop counter ;
run ;

Upvotes: 1

Hugs
Hugs

Reputation: 543

Would the output statement work for you?

data have;
    infile datalines dsd dlm=",";
    input   Variable_names  : $char10.
            Var1            : $char10.
            Var2            : 8.
            Var3            : 8.
            Var4            : 8.
            Var5            : 8.
            Var6            : 8.;
    datalines;
Row 1 , A     , 1, 2, 3, 4, 5
Row 2 , B     , 1, 2, 3, 4, 5
Row 3 , A     , 1, 2, 3, 4, 5
Row 4 , B     , 1, 2, 3, 4, 5
Row 5 , Sample, 1, 2, 3, 4, 5
Row 6 , A     , 1, 2, 3, 4, 5
Row 7 , B     , 1, 2, 3, 4, 5
Row 8 , A     , 1, 2, 3, 4, 5
Row 9 , B     , 1, 2, 3, 4, 5
Row 10, A     , 1, 2, 3, 4, 5
Row 11, B     , 1, 2, 3, 4, 5
Row 12, A     , 1, 2, 3, 4, 5
Row 13, B     , 1, 2, 3, 4, 5
;
run;

data want_without
     want_with;
    set have;
    if strip(Var1) = "Sample" then output want_with;
    else                           output want_without;
run;

Upvotes: 1

Related Questions