SAS: Creating dummy variables from categorical variable

Question

I would like to turn the following long dataset:

data test;
input Id Injury $;
datalines;
1         Ankle
1         Shoulder 
2         Ankle
2         Head
3         Head
3         Shoulder
;
run;

Into a wide dataset that looks like this:

ID  Ankle Shoulder Head
1   1     1        0
2   1     0        1
3   0     1        1

The answer to this question seemed the most relevant, but it fell over at the PROC FREQ stage (my real dataset has around 1 million records and around 30 injury types): Creating dummy variables from multiple strings in the same row

Additional help: https://communities.sas.com/t5/SAS-Statistical-Procedures/Possible-to-create-dummy-variables-with-proc-transpose/td-p/235140

Thanks for the help!

Reeza · Accepted Answer

Here's a basic method that should work easily, even with several million records. First sort the data, then add a count variable set to 1. Next, use PROC TRANSPOSE to flip the data from long to wide. Finally, replace the missing values with 0. The method is fully dynamic: it doesn't matter how many different Injury types you have or how many records there are per person. There are other methods that are probably shorter, but I think this one is simple and easy to understand and modify if required.

data test;
input Id Injury $;
datalines;
1         Ankle
1         Shoulder 
2         Ankle
2         Head
3         Head
3         Shoulder
;
run;

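/* Sort so that BY-group processing in PROC TRANSPOSE works */
proc sort data=test;
by id injury;
run;

/* Add an indicator that becomes the 1 in each dummy column */
data test2;
set test;
count=1;
run;

/* Long to wide: one row per id, one Injury_ column per injury type */
proc transpose data=test2 out=want prefix=Injury_;
by id;
var count;
id injury;
idlabel injury;
run;

/* Replace missing values (injury not recorded for that id) with 0 */
data want;
set want;
array inj(*) injury_:;

do i=1 to dim(inj);
    if inj(i)=. then inj(i) = 0;
end;

drop _name_ i;
run;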
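
One caveat with the transpose step above: PROC TRANSPOSE with an ID statement stops with an error if the same ID value occurs twice within a BY group, which would happen if a person has the same injury recorded more than once. The following is a minimal sketch of one way to guard against that, assuming repeated records should collapse to a single 1. The dataset names test_dup, test_dedup, test_flag and want_dedup are hypothetical and not part of the original answer.

```sas
/* Hypothetical input: Id 1 has Ankle recorded twice */
data test_dup;
input Id Injury $;
datalines;
1 Ankle
1 Ankle
1 Shoulder
2 Head
;
run;

/* NODUPKEY keeps one row per (Id, Injury) pair; OUT= leaves the input untouched */
proc sort data=test_dup out=test_dedup nodupkey;
by id injury;
run;

/* Same indicator variable as in the answer */
data test_flag;
set test_dedup;
count=1;
run;

/* Each injury now appears at most once per id, so the ID statement is safe */
proc transpose data=test_flag out=want_dedup(drop=_name_) prefix=Injury_;
by id;
var count;
id injury;
idlabel injury;
run;

/* Fill the gaps with 0, as in the answer */
data want_dedup;
set want_dedup;
array inj(*) injury_:;

do i=1 to dim(inj);
    if inj(i)=. then inj(i) = 0;
end;

drop i;
run;
```

If the number of times each injury occurs matters rather than just its presence, de-duplicating is the wrong choice; the records would need to be summarised to one count per (Id, Injury) pair before the transpose instead.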
