Reputation: 749
I have the following data
data have;
input id seq value;
datalines;
1 1 4
1 2 4
1 3 0
1 4 0
1 5 0
1 6 4
1 7 4
2 1 1
2 2 1
2 3 5
2 4 5
2 5 5
2 6 8
;
run;
I need to create a groupid variable, which depends on the id and value, so that the output looks like this,
id seq value grpid
1 1 4 1
1 2 4 1
1 3 0 2
1 4 0 2
1 5 0 2
1 6 4 3
1 7 4 3
2 1 1 1
2 2 1 1
2 3 5 2
2 4 5 2
2 5 5 2
2 6 8 3
I have no idea how to achieve this, the error that I run into is this,
ERROR: BY variables are not properly sorted on data set
But I cannot change the sorting, the dataset needs to be sorted by id and seq variables first before generating the grpid.
Upvotes: 0
Views: 176
Reputation: 27498
The BY
statement has option NOTSORTED
to allow you to use first.
and last.
to process rows that are contiguous but otherwise disordered.
data want;
set have;
by id value NOTSORTED;
if first.id then grpid = 1;
else
if first.value then grpid + 1;
run;
Upvotes: 1
Reputation: 4937
Try this
data have;
input id seq value;
datalines;
1 1 4
1 2 4
1 3 0
1 4 0
1 5 0
1 6 4
1 7 4
2 1 1
2 2 1
2 3 5
2 4 5
2 5 5
2 6 8
;
run;
data want;
set have;
by id value notsorted;
if first.id then grpid = 0;
if first.value then grpid + 1;
run;
Upvotes: 1