Reputation: 167
I have a dependent variable with 8 missing values. It is currently a quantitative variable, but I want to split it into two buckets (at or below the median, and above the median) while preserving the 8 missing values. The following code replaces the missing values with zeroes and I don't understand why.
data baseline_all_disc2;
set baseline_all3;
if health_state_m eq . then health_state_m_disc=.; /*This line of code doesn't seem to be working*/
if health_state_m LE 60 then health_state_m_disc=0;
else health_state_m_disc=1;
run;
Please help!
Upvotes: 0
Views: 832
Reputation: 21294
You need to use IF/ELSE IF, not a series of independent IF statements. Your code is doing exactly what you wrote.
First IF: health_state_m_disc is set to missing. Second IF: the LE 60 test is evaluated as well, and because SAS treats a missing value as less than any number, it is also true, so the missing value is overwritten with 0. Switch to IF/ELSE IF so that the second IF statement is never evaluated for missing values.
With the ELSE added, this will work:
data baseline_all_disc2;
set baseline_all3;
if health_state_m eq . then health_state_m_disc=.; /* keep missing values missing */
ELSE if health_state_m LE 60 then health_state_m_disc=0;
else health_state_m_disc=1;
run;
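To see the comparison behaviour in isolation, here is a minimal sketch. The dataset names demo and demo_check, the variable le_60_flag, and the three data values are made up for illustration and are not from the question. The flag should come out as 1 for the missing row, because a missing value compares as less than 60.

```sas
/* Hypothetical test data: three values, one of them missing */
data demo;
  input health_state_m;
  datalines;
45
.
80
;
run;

data demo_check;
  set demo;
  /* A comparison returns 1 (true) or 0 (false).            */
  /* Missing sorts below every number, so the missing row   */
  /* also satisfies LE 60.                                  */
  le_60_flag = (health_state_m LE 60);
run;

proc print data=demo_check;
run;
```

This is why an unguarded second IF overwrites the missing value that the first IF just assigned.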
EDIT: Another option, useful if you have multiple missing values coded (i.e. ., .A-.Z), is the MISSING() function. A benefit of MISSING() is that it works the same way on both character and numeric variables.
data baseline_all_disc2;
set baseline_all3;
if missing(health_state_m) then call missing(health_state_m_disc); /* catches ., and .A-.Z as well */
ELSE if health_state_m LE 60 then health_state_m_disc=0;
else health_state_m_disc=1;
run;
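The difference between eq . and MISSING() only shows up when special missing values are present. The following is a rough sketch under that assumption. The dataset names, the flag variables eq_dot_flag and missing_flag, and the data values are hypothetical. The MISSING statement tells SAS to read the letter A in the input as the special missing value .A.

```sas
/* Hypothetical data containing an ordinary missing (.) and a special missing (.A) */
data demo_special;
  missing A;              /* read the letter A in the input as .A */
  input health_state_m;
  datalines;
45
.
A
80
;
run;

data demo_special_disc;
  set demo_special;
  eq_dot_flag  = (health_state_m eq .);    /* 1 only for the ordinary missing value */
  missing_flag = missing(health_state_m);  /* 1 for both . and .A */
  if missing(health_state_m) then call missing(health_state_m_disc);
  else if health_state_m LE 60 then health_state_m_disc = 0;
  else health_state_m_disc = 1;
run;

proc print data=demo_special_disc;
run;
```

With the eq . test, the .A row would skip the first branch and then satisfy LE 60, since special missing values also compare as less than any number.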
Upvotes: 1
Reputation: 12909
Missing values are considered less than any numeric value, so the line
if health_state_m LE 60 then health_state_m_disc=0;
is also true for missing values and changes them to 0. Add a missing-value check to your second IF statement:
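if (NOT missing(health_state_m) AND health_state_m LE 60) then health_state_m_disc=0;
Note that the final ELSE then needs the same guard, otherwise missing values fall through to it and are set to 1:
else if NOT missing(health_state_m) then health_state_m_disc=1;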
Upvotes: 1