Reputation: 1
I have the dataset with Time and Interval variable as below. I would like to add a sequential ID (Indicator) with SAS based on a condition that Interval is greater than 0.1, as follows:
Time | Interval | Indicator |
---|---|---|
11:40:38 | 0.05 | . |
11:40:41 | 0.05 | . |
11:40:44 | 0.05 | . |
11:40:47 | 0.05 | . |
11:40:50 | 0.05 | . |
11:42:50 | 2 | 1 |
11:42:53 | 0.05 | 2 |
11:42:56 | 0.05 | 3 |
11:42:59 | 0.05 | 4 |
11:43:02 | 0.05 | 5 |
11:43:05 | 0.05 | 6 |
11:43:08 | 0.05 | 7 |
11:43:18 | 0.16667 | 1 |
11:43:21 | 0.05 | 2 |
11:43:24 | 0.05 | 3 |
11:43:27 | 0.05 | 4 |
11:43:30 | 0.05 | 5 |
11:43:33 | 0.05 | 6 |
If I use the code
`data out1; set out ;
by Time;
retain indicator;
if Interval > 0.1 then indicator=1;
indicator+1;
run;`
Indicator is not missing for the first five observations. I would like that it starts counting only when the condition is met (Interval > 0.1).
Thanks!
Upvotes: 0
Views: 335
Reputation: 27508
For yucks, here is a one-liner of @WhyMath logic.
data want;
set have;
retain seq;
seq = ifn(interval > 0.1, 1, ifn(seq, sum(seq,1), seq));
run;
Upvotes: 1
Reputation: 51621
If you want to retain INDICATOR it cannot be on the input dataset, otherwise the SET statement will overwrite the retained value with the value read from the existing dataset.
If you want INDICATOR to start as missing when using the SUM statement then you need to explicitly say so in the RETAIN statement. Otherwise the SUM statement will cause the variable to be initialized to zero.
If looks like you only want to increment when the new variable has already been assigned at least one value.
data want;
set have;
retain new .;
if interval>0.1 then new=1;
else if new > 0 then new+1;
run;
Results:
OBS Time Interval Indicator new
1 11:40:38 0.05000 . .
2 11:40:41 0.05000 . .
3 11:40:44 0.05000 . .
4 11:40:47 0.05000 . .
5 11:40:50 0.05000 . .
6 11:42:50 2.00000 1 1
7 11:42:53 0.05000 2 2
8 11:42:56 0.05000 3 3
9 11:42:59 0.05000 4 4
10 11:43:02 0.05000 5 5
11 11:43:05 0.05000 6 6
12 11:43:08 0.05000 7 7
13 11:43:18 0.16667 1 1
14 11:43:21 0.05000 2 2
15 11:43:24 0.05000 3 3
16 11:43:27 0.05000 4 4
17 11:43:30 0.05000 5 5
18 11:43:33 0.05000 6 6
Upvotes: 0
Reputation: 1394
You can do it with a little modification:
data out1;
set out ;
retain indicator;
if Interval>0.1 then indicator=0;
if indicator^=. then indicator+1;
run;
The summuation will start after the condition Interval>0.1
has been met, because indicator is equal to missing value before that, so indicator+1
would not be calculated.
And you need to initial indicator as 0, not 1. If indicator is equal to 0, indicator^=.
will be satisfied and indicator+1
will be calculated.
Upvotes: 1