Reputation: 357
I have a dataset with variables as ID and Diagnosis. I want to assign serial numnber to diagnosis based on ID. For eg: If ID 1 has 5 diagnosis, I want serial numbers to be (1, 2, 3, 4,5) and then if ID 2 has 3 diagnosis, the serial numbers would be (1, 2,3) and so on.
Is there a way to do that in SAS? Thanks so much.
HI Thank you so much for replying. I actually want it like this.
Obs diagnosis id diagnosis_serial_number
1 XX 1 1
2 XX1 1 2
3 XX2 1 3
4 XX3 1 4
5 XX4 1 5
6 XX 2 1
7 XX1 2 2
8 XX2 2 3
Upvotes: 0
Views: 7591
Reputation: 949
Nupur, to get the output you just described, then try this:
proc sort data=temp;
by id diagnosis;
run;
data temp;
set temp;
by id diagnosis;
retain diagnosis_serial_number 0;
if first.id then do;
diagnosis_serial_number = 0;
end;
diagnosis_serial_number + 1;
run;
Upvotes: 1
Reputation: 949
Nupur, if you need to have the serial number for each diagnosis, then try the following:
proc sort data=temp;
by id diagnosis;
run;
data temp_serial_number;
set temp;
by id diagnosis;
format diagnosis_serial_number $30.;
retain diagnosis_serial_number count;
if first.id then do;
count = 0;
diagnosis_serial_number = "(";
end;
count + 1;
diagnosis_serial_number = trim(left(diagnosis_serial_number)) || trim(left(count));
if last.id then do;
diagnosis_serial_number = trim(left(diagnosis_serial_number)) || ")";
output;
end;
else do;
diagnosis_serial_number = trim(left(diagnosis_serial_number)) || ",";
end;
drop diagnosis count;
run;
data temp;
merge temp(in=in_temp) temp_serial_number(in=in_serial_nbr);
by id;
if in_temp then output;
run;
proc print data=temp;
run;
The output would look like:
Obs diagnosis id diagnosis_serial_number
1 XX 1 (1,2,3,4,5)
2 XX1 1 (1,2,3,4,5)
3 XX2 1 (1,2,3,4,5)
4 XX3 1 (1,2,3,4,5)
5 XX4 1 (1,2,3,4,5)
6 XX 2 (1,2,3)
7 XX1 2 (1,2,3)
8 XX2 2 (1,2,3)
Upvotes: 0
Reputation: 426
I'm a little unclear on exactly what you're asking for, but I think this might do the trick. It uses a retain statement and an increment step to make a counter variable called serial
, and uses by group processing to reset the counter each time a new ID group is found.
Note that your input dataset (called input_ds
here) must first be sorted by ID before using this code.
data output_ds;
retain serial;
set input_ds;
by ID;
if first.ID then serial = 0;
serial = serial + 1;
run;
Upvotes: 2