Insu Q
Insu Q

Reputation: 423

In SAS, how do you stop flagging a group of rows if a specific condition is met?

I have a table in SAS dataset that looks like this:

proc sql;
create table my_table
    (id char(1),
    my_date num format=date9.,
    my_col num);
insert into my_table
    values('A','01JAN2010'd,.)
    values('A','02JAN2010'd,0)
    values('A','03DEC2009'd,1)
    values('A','04NOV2009'd,1)
    values('B','01JAN2010'd,.)
    values('B','02NOV2009'd,2)
    values('C','01JAN2010'd,.)
    values('C','02OCT2009'd,3)
    values('D','01JAN2010'd,.)
    values('D','02NOV2009'd,2)
    values('D','03OCT2009'd,1)
    values('D','04AUG2009'd,2)
    values('D','05MAY2009'd,3)
    values('D','06APR2009'd,1);
quit;

I am trying to create a new column desired that, for each group of id column, flags the row with a value of 1 if the value in my_col is missing or less than 3.

The part I'm having trouble with is that when there is a my_col value that is greater than 2, I need the desired value for that row to be missing and also stop flagging any remaining rows in the id group with a value of 1.

The resulting dataset should look like this:

+----+-----------+--------+---------+
| id | my_date   | my_col | desired |
+----+-----------+--------+---------+
| A  | 01JAN2010 | .      | 1       |
| A  | 02JAN2010 | 0      | 1       |
| A  | 03DEC2009 | 1      | 1       |
| A  | 04NOV2009 | 1      | 1       |
| B  | 01JAN2009 | .      | 1       |
| B  | 02NOV2009 | 2      | 1       |
| C  | 01JAN2010 | .      | 1       |
| C  | 02OCT2009 | 3      | .       |
| D  | 01JAN2010 | .      | 1       |
| D  | 02NOV2009 | 2      | 1       |
| D  | 03OCT2009 | 1      | 1       |
| D  | 04AUG2009 | 2      | 1       |
| D  | 05MAY2009 | 3      | .       |
| D  | 06APR2009 | 1      | .       |
+----+-----------+--------+---------+

Upvotes: 1

Views: 269

Answers (2)

Tom
Tom

Reputation: 51566

Looks like a simple application of a retained variable. Set the flag to 1 when you start a new group and then set it to missing when the value of MY_COL is larger than 2.

data want;
  set my_table ;
  by id;
  if first.id then desired=1;
  if my_col>2 then desired=.;
  retain desired;
run;

Also it is not clear why you used such complicated code to create your example data. Why not a simple data step?

data my_table;
  input id :$1. my_date :date. my_col;
  format my_date date9.;
cards;
A 01JAN2010 .
A 02JAN2010 0
A 03DEC2009 1
A 04NOV2009 1
B 01JAN2010 .
B 02NOV2009 2
C 01JAN2010 .
C 02OCT2009 3
D 01JAN2010 .
D 02NOV2009 2
D 03OCT2009 1
D 04AUG2009 2
D 05MAY2009 3
D 06APR2009 1
;

Upvotes: 4

Ken Edmonds
Ken Edmonds

Reputation: 121

I can't think of a simpler way to do it, but this works. You will need to have your data sorted by id.

data my_table2;
    set my_table;
    by id;
    format gt2flag $1.;
    retain gt2flag;
    
    if first.id then gt2flag='';
    
    if my_col gt 2 then gt2flag='Y';
    
    if gt2flag = 'Y' then desired=.;
    else desired=1;
    
    drop gt2flag;
run;
id  my_date my_col  desired
A   01JAN2010   .   1
A   02JAN2010   0   1
A   03DEC2009   1   1
A   04NOV2009   1   1
B   01JAN2010   .   1
B   02NOV2009   2   1
C   01JAN2010   .   1
C   02OCT2009   3   .
D   01JAN2010   .   1
D   02NOV2009   2   1
D   03OCT2009   1   1
D   04AUG2009   2   1
D   05MAY2009   3   .
D   06APR2009   1   .

Upvotes: 1

Related Questions