Reputation: 3
given dataset 'temp' looks like this..
index | code1 | code2 | code3 |
---|---|---|---|
A | P1 | P2 | P3 |
B | P1 | P3 | P4 |
C | P2 | P4 | N1 |
then I want to make new dataset like this
index | P1 | P2 | P3 | P4 | n1 |
---|---|---|---|---|---|
A | 1 | 1 | 1 | 0 | 0 |
B | 1 | 0 | 1 | 1 | 0 |
C | 0 | 1 | 0 | 1 | 1 |
My code is here...
%macro freq;
%do i = 1 %to 3;
%do j = 1 %to 5;
if substr(code&i.,1,1) = "P" then
if input(substr(code&i.,2,1),1.) = &j. then p&j. = 1;
if substr(code&i.,1,1) = "N" then
if input(substr(code&i.,2,1),1.) = &j. then n&j. = 1;
%end;
%end;
%mend;
But it's not cool :(
How can I create a new column whose name is the value of variables(code1, code2,...)?
Is there any other simple way?
Upvotes: 0
Views: 2637
Reputation: 12849
One more alternative:
proc transpose data=have out=long;
by index;
var code:;
run;
data long2;
set long;
value = 1;
run;
proc transpose data=long2 out=wide;
by index;
id col1;
var value;
run;
/* Convert missing to zeroes */
data want;
set wide;
array vars _NUMERIC_;
do over vars;
if(vars = .) then vars = 0;
end;
drop _NAME_;
run;
Output:
index P1 P2 P3 P4 N1
A 1 1 1 0 0
B 1 0 1 1 0
C 0 1 0 1 1
Upvotes: 0
Reputation: 51566
The simplest way to convert values into variable names is via PROC TRANSPOSE. So first convert your wide dataset into a tall dataset. You could use PROC TRANSPOSE to do that, but to make your target dataset PROC TRANSPOSE will need some numeric variable to transpose. So why not use a data step to make the tall dataset and include a numeric variable that is set to 1.
The PROC TRANSPOSE step will give you a dataset with either a 1 or a missing value for the new variables. You can use PROC STDIZE to change the missing values into zeros.
data have;
input index $ (code1-code3) (:$32.) ;
cards;
A P1 P2 P3
B P1 P3 P4
C P2 P4 N1
;
data tall;
set have ;
array code code1-code3;
length _name_ $32 dummy 8;
retain dummy 1;
do column=1 to dim(code);
_name_=code[column];
if not missing(_name_) then output;
end;
run;
proc transpose data=tall out=want(drop=_name_);
by index ;
id _name_;
var dummy;
run;
proc stdize reponly missing=0 data=want ;
var _numeric_;
run;
Upvotes: 0
Reputation: 585
You can achieve this without a macro by using ARRAY
and the VNAME
function in a DATA
step.
data want;
set have;
/* Initialize flag variables. */
length P1-P4 3 N1 3;
/* Define arrays. */
array code [*] code1-code3;
array flags [*] P1-P4 N1;
/* Loop over the arrays. */
do i = 1 to dim(flags);
flags[i] = 0;
do j = 1 to dim(code);
if vname(flags[i]) = code[j] then flags[i] = 1;
end;
end;
keep index P1-P4 N1;
run;
Upvotes: 2
Reputation: 4937
How about
data have;
input (index code1 code2 code3)($);
datalines;
A P1 P2 P3
B P1 P3 P4
C P2 P4 N1
;
data temp;
set have;
array c code:;
do over c;
v = c;
d = 1;
output;
end;
run;
proc transpose data = temp out = want(drop = _:);
by index;
id v;
var d;
run;
Upvotes: 3