Reputation: 33
ID no. 6 returns 'Underweight' when it should return 'Overweight'. The lines of code in question, lines 30-53, contain nested IF statements, and this is the only observation they classify incorrectly. Their purpose is to partition the sample into weight classes: Underweight, Average or Overweight.
TITLE 'A SHORT SAS PROGRAM';
OPTIONS LS=72;
* Create data file with height and weight data;
DATA HTWT;
INPUT ID GENDER $ HEIGHT WEIGHT;
DATALINES;
1 M 68.5 155.0
2 F 61.2 99.00
3 F 63.0 115.0
4 M 70.0 205.0
5 M 68.6 170.0
6 F 65.1 125.0
7 M 72.4 220.0
;
* Create a new categorical variable for height;
DATA HTWT;
SET work.HTWT;
IF HEIGHT < 68 THEN
STATURE='Short';
IF HEIGHT >=68 THEN
STATURE='Tall';
RUN;
* Create a new categorical variable for weight;
DATA HTWT;
SET work.HTWT;
IF GENDER='M' THEN
IF WEIGHT > 170 THEN
WEIGHT_CLASS='Overweight';
IF 170 >=WEIGHT >=150 THEN
WEIGHT_CLASS='Average';
IF WEIGHT < 140 THEN
WEIGHT_CLASS='Underweight';
ELSE IF GENDER='F' THEN
IF WEIGHT > 120 THEN
WEIGHT_CLASS='Overweight';
IF 120 >=WEIGHT >=100 THEN
WEIGHT_CLASS='Average';
IF WEIGHT < 100 THEN
WEIGHT_CLASS='Underweight';
RUN;
* Changing units of height from inches to centimeters;
DATA HTWT;
SET work.HTWT;
HEIGHT=2.54 * HEIGHT;
RUN;
* Creates HEALTH_INDEX;
DATA HTWT;
SET work.HTWT;
HEALTH_INDEX=WEIGHT/HEIGHT;
RUN;
* Print the data file HTWT;
PROC PRINT DATA=HTWT;
TITLE 'HEIGHT AND WEIGHT DATA';
RUN;
* Sorts the data by gender. Some procedures require sorted data;
PROC SORT DATA=HTWT OUT=sorted;
BY GENDER;
RUN;
* Print the sorted data file;
PROC PRINT DATA=sorted;
TITLE 'GENDER SORTED HTWT DATA';
RUN;
Upvotes: 0
Views: 5466
Reputation: 1
IF statements are useful, but sometimes it is safer to create a format and use the PUT function to do the assignments. This is particularly true if you would otherwise have to write a lot of IFs since, in my case at least, the more typing I do, the greater my chance of screwing up.
The code that uses a format with PUT is a lot easier to check and maintain.
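A minimal sketch of the format approach, using the cutoffs from the question. The format names MWTCLASS and FWTCLASS and the dataset names HAVE and WANT are hypothetical. The question leaves males between 140 and 150 unclassified; treating them as Underweight here is an assumption.

```sas
/* Sample data copied from the question */
data have;
  input ID GENDER $ HEIGHT WEIGHT;
  datalines;
1 M 68.5 155.0
2 F 61.2 99.00
3 F 63.0 115.0
4 M 70.0 205.0
5 M 68.6 170.0
6 F 65.1 125.0
7 M 72.4 220.0
;
run;

/* One numeric format per gender (names are hypothetical).       */
/* "-<" excludes the upper end point, "<-" excludes the lower.   */
proc format;
  value mwtclass low -< 150  = 'Underweight'
                 150 -  170  = 'Average'
                 170 <- high = 'Overweight';
  value fwtclass low -< 100  = 'Underweight'
                 100 -  120  = 'Average'
                 120 <- high = 'Overweight';
run;

data want;
  set have;
  length WEIGHT_CLASS $11;  /* long enough for 'Underweight' */
  if GENDER='M' then WEIGHT_CLASS = put(WEIGHT, mwtclass.);
  else if GENDER='F' then WEIGHT_CLASS = put(WEIGHT, fwtclass.);
run;

proc print data=want;
run;
```

With these ranges, ID 6 (F, 125) falls in the `120 <- high` range and is labelled Overweight. The cutoffs live in one place, so changing a boundary means editing the format and nothing else.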
Upvotes: 0
Reputation: 669
Please don't accept my answer as the correct one, because I'm late and the previous answers are clear and useful. But here is a simple piece of advice: when you would otherwise have to chain a lot of nested IF statements, use a SELECT/WHEN block instead (similar to CASE/WHEN in SQL), so that every possible value of the variable falls into exactly one branch. Here x and y stand for your cutoffs and '' for the class labels.
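data want;
  set have;
  /* x, y and the empty strings are placeholders: fill in your cutoffs and labels */
  if gender='M' then do;
    select;  /* no select-expression, so each WHEN holds a full condition */
      when (weight > x) weight_class='';
      when (weight < y) weight_class='';
      otherwise weight_class='';
    end;
  end;
  else if gender='F' then do;
    select;
      when (weight > x) weight_class='';
      when (weight < y) weight_class='';
      otherwise weight_class='';
    end;
  end;
run;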
Upvotes: 1
Reputation: 336
Change your nested IF statements to:
IF GENDER='M' THEN DO;
  IF WEIGHT > 170 THEN
    WEIGHT_CLASS='Overweight';
  ELSE IF 150 <= WEIGHT <= 170 THEN
    WEIGHT_CLASS='Average';
  ELSE
    WEIGHT_CLASS='Underweight';
END;
ELSE DO;
  IF WEIGHT > 120 THEN
    WEIGHT_CLASS='Overweight';
  ELSE IF 100 <= WEIGHT <= 120 THEN
    WEIGHT_CLASS='Average';
  ELSE
    WEIGHT_CLASS='Underweight';
END;
See if that works.
Upvotes: 1
Reputation: 51566
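Consecutive IF statements are processed in order, and each one stands alone unless it is tied to the previous one with ELSE or wrapped in a DO/END block. In your code only the first weight test is actually nested under GENDER='M'. The test for WEIGHT < 140 is applied to every observation, so of course number 6 (weight 125) is listed as underweight. If you want to control execution order, add some ELSE and/or DO/END blocks. Your basic structure should probably look like this.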
if A then do;
  if B then xxx ;
  else if C then xxx ;
  else if D then xxx ;
end;
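To make the grouping concrete, here is a sketch of how the DATA step effectively reads the statements from the question. The DO/END pairs are added only to show which test belongs to which IF; they do not change the logic. The dataset name PARSED and the LENGTH statement are additions for illustration: without the LENGTH statement, the first assignment ('Overweight') would fix the variable at 10 characters and truncate 'Underweight'.

```sas
/* Sample data copied from the question */
data HTWT;
  input ID GENDER $ HEIGHT WEIGHT;
  datalines;
1 M 68.5 155.0
2 F 61.2 99.00
3 F 63.0 115.0
4 M 70.0 205.0
5 M 68.6 170.0
6 F 65.1 125.0
7 M 72.4 220.0
;
run;

data PARSED;  /* hypothetical output name */
  set HTWT;
  length WEIGHT_CLASS $11;  /* addition: room for 'Underweight' */

  /* Statement 1: only this test is guarded by GENDER='M' */
  if GENDER='M' then do;
    if WEIGHT > 170 then WEIGHT_CLASS='Overweight';
  end;

  /* Statement 2: runs for every observation */
  if 170 >= WEIGHT >= 150 then WEIGHT_CLASS='Average';

  /* Statement 3: also runs for every observation,  */
  /* and the ELSE pairs with this IF, not with 'M'  */
  if WEIGHT < 140 then WEIGHT_CLASS='Underweight';
  else do;
    if GENDER='F' then do;
      if WEIGHT > 120 then WEIGHT_CLASS='Overweight';
    end;
  end;

  /* Statements 4 and 5: run for every observation */
  if 120 >= WEIGHT >= 100 then WEIGHT_CLASS='Average';
  if WEIGHT < 100 then WEIGHT_CLASS='Underweight';
run;

proc print data=PARSED;
run;
```

For ID 6 (F, 125) the test WEIGHT < 140 is true, so the class is set to Underweight and the ELSE branch holding the female Overweight test is never reached. Neither of the last two tests matches 125, so Underweight is what gets printed.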
Upvotes: 0