Reputation: 328
I have the following data set:
DATA Test1;
INFILE DATALINES DSD MISSOVER;
INPUT A B C;
DATALINES;
1, 2, 3
1, ,
, , 3
, ,
;
RUN;
Then I create the following two data sets:
DATA Test2;
SET Test1;
i + a + b + c;
RUN;
DATA Test3;
SET Test1;
j + a + b + c + i;
RUN;
In Test2, the value of i is 6 in every observation.
In Test3, the value of j is 0 in every observation and i is missing (a period).
I have tried to think how SAS behaves in this case, but can't get my head around it. Can someone help?
Upvotes: 0
Views: 43
Reputation: 51566
Your data steps are both using the sum statement. The form of a sum statement is:
variable + expression ;
which is basically the equivalent of
retain variable 0;
variable=sum(variable,expression);
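As a rough sketch of that equivalence, the first step from the question could be spelled out by hand as shown here. The data set name Test2_explicit is made up for illustration, and the step assumes Test1 exists as posted in the question.

```sas
DATA Test2_explicit;          /* hypothetical name, not from the question */
  SET Test1;
  RETAIN i 0;                 /* start at 0 and keep the value across observations */
  i = SUM(i, a + b + c);      /* SUM ignores a missing argument, so i is left
                                 unchanged whenever a + b + c is missing */
RUN;
```

The point of writing it this way is that the expression a + b + c is evaluated first with the ordinary + operator, and only its result is passed to SUM.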
So in your first example
i + a + b + c;
the expression is (A+B+C). Note that A+B+C is missing for every observation except the first, because the + operator returns a missing value whenever any operand is missing. So the 6 from the first observation is retained, but nothing is ever added to it.
In the second case you reference a variable I that does not exist in Test1, so it is missing in every observation. (Perhaps you meant for that step to read from the second data set?) As a result, (a+b+c+i) is missing for every observation, and J keeps its initial value of zero.
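If the intent really was to read from the second data set, a sketch of that step might look like this. The name Test3_from_Test2 is hypothetical, and the values in the comments assume Test2 was built exactly as in the question.

```sas
DATA Test3_from_Test2;        /* hypothetical name, not from the question */
  SET Test2;                  /* i now exists: it is 6 in every observation */
  j + a + b + c + i;          /* observation 1: 1 + 2 + 3 + 6 = 12 is added to j;
                                 later observations: expression is missing,
                                 so j stays at 12 */
RUN;
```

Even then, only the first observation contributes, because a, b, or c is missing in every later observation.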
If you really wanted I to accumulate all of the non-missing values, then you should have used:
i + sum(a,b,c);
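Put into a full step, that might look like the sketch below. The data set name Test4 is made up for illustration, and the running totals in the comment are worked out by hand from the four observations in Test1.

```sas
DATA Test4;                   /* hypothetical name, not from the question */
  SET Test1;
  i + SUM(a, b, c);           /* SUM skips missing arguments:
                                 obs 1: 1 + 2 + 3 = 6   -> i = 6
                                 obs 2: 1               -> i = 7
                                 obs 3: 3               -> i = 10
                                 obs 4: all missing     -> i stays 10 */
RUN;
```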
Upvotes: 2