Reputation: 53
I'm using SAS to create new variables for a data set. I used this code to create a permanent data set, a temporary data set from the permanent data set, and the new variables:
libname HW4 'C:\Users\johns\Desktop\SAS'; /* path changed because the original contains revealing information */
data work.ldl;
set hw4.ldldat;
delta_LDL = LDL_post - LDL_pre;
if LDL_pre = . then group = "";
else if LDL_pre<100 then group="Pre less than 100";
else if LDL_pre>100 then group="Pre greater than 100";
if LDL_post =. then group ="";
else if LDL_post<100 then group="Post less than 100";
else if LDL_post>100 then group="Post greater than 100";
run;
I received this note in the log:
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      4 at 4:26
Does this mean that I've done something wrong? Is there something wrong within my code?
Upvotes: 0
Views: 148
Reputation: 21274
No, you haven't done anything wrong. This is a NOTE, not an error, and if you look in your log at the line and column it reports, it most likely points to your delta calculation:
delta_LDL = LDL_post - LDL_pre;
Your IF statements also check for missing values, so clearly those values can be missing. SAS is telling you that wherever one of the values in the delta calculation was missing, the result was set to missing. In SAS, arithmetic operators such as + and - return missing when any operand is missing. Some functions, such as SUM, ignore missing arguments instead, which works out the same as treating them as 0 as long as at least one argument is non-missing.
Quick example:
data demo;
x=1;
y=2;
z=.;
a=x+y;
b=x+z;
c=y+z;
output;
a=sum(x, y);
b=sum(x, z);
c=sum(y, z);
output;
run;
title 'Row1 with sum operator, Row2 with sum function';
proc print data=demo;
run;
Row1 with sum operator, Row2 with sum function
Obs    x    y    z    a    b    c
  1    1    2    .    3    .    .
  2    1    2    .    3    1    2
EDIT: You can handle this in two ways: leave the result missing when either value is missing, or treat missing values as 0.
Option #1
if nmiss(ldl_post, ldl_pre) = 0 then delta_LDL = LDL_post - LDL_pre;
Option #2
delta_LDL = coalesce(LDL_post, 0) - coalesce(LDL_pre, 0);
Given the context of your question, with cholesterol levels it makes sense to use Option #1.
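To make the difference between the two options concrete, here is a minimal sketch on three made-up observations. The data set name options_demo, the variables delta_opt1 and delta_opt2, and the LDL values are all hypothetical and only for illustration.

```sas
/* Hypothetical data: values are invented to illustrate the two options */
data options_demo;
    input LDL_pre LDL_post;
    /* Option #1: compute the change only when both values are present */
    if nmiss(LDL_post, LDL_pre) = 0 then delta_opt1 = LDL_post - LDL_pre;
    /* Option #2: substitute 0 for any missing value before subtracting */
    delta_opt2 = coalesce(LDL_post, 0) - coalesce(LDL_pre, 0);
    datalines;
120 95
130 .
. 88
;
run;

proc print data=options_demo;
run;
```

For the second observation, Option #2 computes 0 - 130 = -130, which is not a real change in LDL, while Option #1 leaves the delta missing and does not trigger the NOTE. That is why Option #1 is the better fit here.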
Upvotes: 0
Reputation: 51566
The NOTE does not necessarily mean that you have done something wrong, but there IS something wrong with your code.
The note is saying that there are 4 observations where either LDL_post or LDL_pre is missing, so the resulting Delta_LDL is missing. You can eliminate the note by changing the code to test whether they are missing before making the calculation.
The problem with your code is that you are overwriting the value of GROUP based on LDL_pre with the value based on LDL_post. Either make two GROUP variables or combine the logic into a single set of IF/ELSE IF/.../ELSE conditions. You also do not assign a value when the LDL value is exactly 100.
Make sure to define the length of GROUP before using it. Your current code will cause SAS to define GROUP with a length of $1, since the first thing you do is assign an empty string to it.
data work.ldl;
set hw4.ldldat;
length delta_LDL 8 group_pre group_post $30 ;
if nmiss(ldl_post,ldl_pre)=0 then delta_LDL = LDL_post - LDL_pre;
if LDL_pre = . then group_pre = "Missing";
else if LDL_pre<=100 then group_pre="Pre less than or equal to 100";
else if LDL_pre>100 then group_pre="Pre greater than 100";
if LDL_post =. then group_post ="Missing";
else if LDL_post<=100 then group_post="Post less than or equal to 100";
else if LDL_post>100 then group_post="Post greater than 100";
run;
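To see the length problem in isolation, here is a minimal sketch. The data set names length_demo and length_fixed and the variable len are made up for illustration. VLENGTH returns the declared length of a variable, and PUT writes the values to the log.

```sas
/* Hypothetical demo: data set and variable names are illustrative only */
data length_demo;
    /* GROUP first appears in an assignment of "", so its length is fixed from that constant */
    group = "";
    group = "Pre greater than 100";   /* longer value is truncated to fit */
    len = vlength(group);             /* declared length of GROUP */
    put group= len=;
run;

data length_fixed;
    length group $30;                 /* declare the length before first use */
    group = "";
    group = "Pre greater than 100";   /* now stored in full */
    len = vlength(group);
    put group= len=;
run;
```

The first step should report a length of 1 with the text cut down to its first character, while the second should keep the full label with a length of 30.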
Upvotes: 0