Reputation: 119
I would like to create a new variable in my dataset: a binary indicator of whether a patient has a tobacco-related disease. I am looking at patient data in which each patient has up to 9 disease codes. I have a dataset called tobacco that stores all the tobacco disease codes.
This is what I thought I could do:
data outpreg;
set outpreg;
if diag1 = tobacco OR diag2 = tobacco OR diag3 = tobacco or diag4 = tobacco or diag5 = tobacco or diag6 = tobacco or
diag7 = tobacco or diag8 = tobacco or diag9 = tobacco then co2=1;
run;
But this flags too many patients for it to be correct. Any help would be greatly appreciated.
Upvotes: 0
Views: 202
Reputation: 2762
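It's not doing what you want. Your current code compares the value of diag1 to a variable named tobacco in the same outpreg data set. Since there is no variable tobacco, SAS creates a new variable tobacco and initializes it to missing (.). Each comparison is therefore true whenever the diag value is itself missing (or, if the diag variables are character, whenever the code cannot be converted to a number), which would likely explain the inflated count. To do what you want, I would join the outpreg data set to the tobacco data set once for each diag variable.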
proc sql;
    select
        o.*,
        t1.tobacco_cd is not null or
        t2.tobacco_cd is not null or
        t3.tobacco_cd is not null as co2
    from
        outpreg as o
        left join tobacco as t1
            on o.diag1 = t1.tobacco_cd
        left join tobacco as t2
            on o.diag2 = t2.tobacco_cd
        left join tobacco as t3
            on o.diag3 = t3.tobacco_cd
    ;
quit;
This checks each diag variable against the list of codes, setting co2 to 1 if it matches, and 0 if it doesn't. For example, if diag1 matches, then t1.tobacco_cd is not null would be true, and the entire expression evaluates to 1.
You'd have to expand it to cover all nine of your variables instead of just three.
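If writing nine joins feels heavy, the same lookup can be sketched in a single data step with an array and a hash object. This is only a sketch under assumptions: it supposes that tobacco has a character column named tobacco_cd (as in the query above), that diag1-diag9 are character variables, and it writes to a hypothetical output data set outpreg2.

```sas
data outpreg2;
    /* Never executes, but puts tobacco_cd into the PDV so the hash key
       has a type and length at compile time */
    if 0 then set tobacco(keep=tobacco_cd);

    /* Load the tobacco codes into memory once, on the first iteration */
    if _n_ = 1 then do;
        declare hash codes(dataset: 'tobacco');
        codes.defineKey('tobacco_cd');
        codes.defineDone();
    end;

    set outpreg;
    array diag{9} diag1-diag9;

    /* Stop scanning as soon as one diagnosis code is found in the list */
    co2 = 0;
    do i = 1 to 9 while (co2 = 0);
        if codes.check(key: diag{i}) = 0 then co2 = 1;   /* 0 means key found */
    end;

    drop i tobacco_cd;
run;
```

Unlike the joins, this cannot multiply rows if tobacco happens to contain the same code more than once, because the hash object keeps a single entry per key by default.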
Another option is to put your tobacco codes into a format like Joe suggested in this question.
proc format;
    value $tobaccocd
        '30300','30301','30302','30303' = 'Tobacco'
        other = 'Not Tobacco';
run;
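Rather than typing the codes by hand, the format could probably be built from the tobacco data set itself through the CNTLIN= option of PROC FORMAT. The following is a sketch, assuming tobacco has a character column named tobacco_cd with no duplicate values (duplicates would make PROC FORMAT complain about overlapping ranges, so a PROC SORT with NODUPKEY beforehand may be needed). The data set name tobacco_fmt is made up for illustration.

```sas
/* Build a control data set: one row per code plus one OTHER row */
data tobacco_fmt;
    length label $11;                        /* long enough for 'Not Tobacco' */
    retain fmtname '$tobaccocd' type 'C';    /* character format */
    set tobacco(keep=tobacco_cd rename=(tobacco_cd=start)) end=last;

    label = 'Tobacco';
    output;

    /* After the last code, add the catch-all row (HLO='O' means OTHER) */
    if last then do;
        hlo = 'O';
        label = 'Not Tobacco';
        output;
    end;
run;

/* Create the $tobaccocd format from the control data set */
proc format cntlin=tobacco_fmt;
run;
```

This keeps the format in sync with the code list: rerunning the two steps after tobacco changes rebuilds $tobaccocd.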
Then you could create your co2 variable in a data step like this:
data outpreg2;
    set outpreg;
    if put(diag1,$tobaccocd.) = 'Tobacco' or
       put(diag2,$tobaccocd.) = 'Tobacco' or
       put(diag3,$tobaccocd.) = 'Tobacco' then co2 = 1;
    else co2 = 0;
run;
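As with the join, this only covers three of the nine variables. One way to cover all of them without writing nine conditions is an array loop; a minimal sketch, assuming diag1-diag9 are character variables and the $tobaccocd format has already been created:

```sas
data outpreg2;
    set outpreg;
    array diag{9} diag1-diag9;

    /* Start at 0 and flip to 1 if any diagnosis code maps to 'Tobacco' */
    co2 = 0;
    do i = 1 to 9;
        if put(diag{i}, $tobaccocd.) = 'Tobacco' then co2 = 1;
    end;

    drop i;   /* loop counter is not needed in the output */
run;
```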
Upvotes: 1