Learner_new
Learner_new

Reputation: 5

Group SAS rows based on conditions

I am new to SAS community and need your help for one of my analysis. I have a table like below:

ID VAR Group Breach Month
1 A 1 202001
2 A 1 202002
3 A 1 202003
4 A 1 202004
5 A 1 202005
6 A 1 202006
7 A 0 202007
8 A 1 202008
9 A 1 202009
10 A 0 202010
11 A 1 202011
12 A 0 202012
13 A 1 202101
14 A 1 202102
15 A 0 202103
16 A 1 202104
17 A 1 202105
18 A 1 202106
19 A 1 202107
20 A 0 202108
21 A 1 202109

I want only to select rows which has 2 consecutive breaches within a group A , only once in a block of consecutive 0's or 1's. So basically I want the following table as output:

ID VAR Group Breach Month Result
1 A 1 202001 No
2 A 1 202002 Yes
3 A 1 202003 No
4 A 1 202004 No
5 A 1 202005 No
6 A 1 202006 No
7 A 0 202007 No
8 A 1 202008 No
9 A 1 202009 Yes
10 A 0 202010 No
11 A 1 202011 No
12 A 0 202012 No
13 A 1 202101 No
14 A 1 202102 Yes
15 A 0 202103 No
16 A 1 202104 No
17 A 1 202105 Yes
18 A 1 202106 No
19 A 1 202107 No
20 A 0 202108 No
21 A 1 202109 No

Thanks in Advance!!

Upvotes: 0

Views: 234

Answers (2)

Richard
Richard

Reputation: 27516

A DOW loop that uses the BY group processing option 'NOTSORTED' can be used to compute your result. Variables that hold an assertion result (i.e. flag values) are often left simply as 0 meaning assertion was false, and 1 true.

Example (Draycut's data, extended to have consecutive breach=0 states at idvar=23):

data have;
input IDVAR Group $ Breach Month;
datalines;
1 A 1 202001
2 A 1 202002
3 A 1 202003
4 A 1 202004
5 A 1 202005
6 A 1 202006
7 A 0 202007
8 A 1 202008
9 A 1 202009
10 A 0 202010
11 A 1 202011
12 A 0 202012
13 A 1 202101
14 A 1 202102
15 A 0 202103
16 A 1 202104
17 A 1 202105
18 A 1 202106
19 A 1 202107
20 A 0 202108
21 A 1 202109
22 A 0 202110
23 A 0 202111
24 A 0 202112
;

data want;
  do _n_ = 1 by 1 until (last.breach);
    set have;
    by group breach notsorted;
    result = (_n_ = 2);
    output;
  end;
  label result = 'Flag for 1st consecutive repeated state of Breach';
run;

Upvotes: 1

PeterClemmensen
PeterClemmensen

Reputation: 4937

Welcome :-)

Try this

data have;
input IDVAR Group $ Breach Month;
datalines;
1 A 1 202001
2 A 1 202002
3 A 1 202003
4 A 1 202004
5 A 1 202005
6 A 1 202006
7 A 0 202007
8 A 1 202008
9 A 1 202009
10 A 0 202010
11 A 1 202011
12 A 0 202012
13 A 1 202101
14 A 1 202102
15 A 0 202103
16 A 1 202104
17 A 1 202105
18 A 1 202106
19 A 1 202107
20 A 0 202108
21 A 1 202109
;

data want;
   set have;
   by Group;
   if Breach = 1 then c + 1;
   if first.Group then c = 1;
   if Breach = 0 then c = 0;
   Result = ifc(c = 2, 'Yes', 'No');
run;

Upvotes: 1

Related Questions