SarahB

Reputation: 328

SAS producing strange output, need clarification

I have the following data set:

DATA Test1;
INFILE DATALINES DSD MISSOVER;
INPUT A  B  C;
DATALINES;
1, 2, 3
1,  , 
 ,  , 3
 ,   ,
;
RUN;

Then I create the following two data sets:

DATA Test2;
SET Test1;
i + a + b + c;
RUN;

DATA Test3;
SET Test1;
j + a + b + c + i;
RUN;

In Test2, the value of i is 6 in every observation.

In Test3, the value of j is 0 in every observation and i is missing (shown as a period).

I have tried to work out how SAS behaves in this case, but I can't get my head around it. Can someone help?

Upvotes: 0

Views: 43

Answers (1)

Tom

Reputation: 51566

Both of your data steps use the sum statement. The form of a sum statement is:

variable + expression ;

which is basically the equivalent of

retain variable 0;
variable=sum(variable,expression);

So in your first example

i + a + b + c;

the expression is (A+B+C). Note that A+B+C is missing for every observation except the first one, because at least one of the variables is missing and the + operator returns missing whenever any operand is missing. So the 6 from the first observation is retained, but nothing is added to it afterwards.

In the second case you reference a variable I that does not exist in Test1, so SAS creates it with missing values for every observation. (Perhaps you meant for that step to read from the second dataset, Test2?) As a result, (a+b+c+i) is missing for every observation, and J keeps its initial value of zero.
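To make the expansion concrete, here is a sketch of your first step written without the sum statement. The dataset name Test2_explicit is only a placeholder for illustration; the step should give the same values of i as Test2.

```sas
/* Sketch: Test2 rewritten with an explicit RETAIN.        */
/* Test2_explicit is a hypothetical name, not from the post */
DATA Test2_explicit;
  SET Test1;
  RETAIN i 0;              /* the sum statement implies this RETAIN    */
  i = SUM(i, a + b + c);   /* a + b + c is missing if any operand is   */
                           /* missing; SUM() then ignores that missing */
RUN;
```

Only the first observation has all three values present, so that is the only time anything is added to i.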

If you really wanted I to accumulate all of the non-missing values, then you should have used:

i + sum(a,b,c);
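As a complete step, that could look like the sketch below. Test4 is a hypothetical dataset name, and the values in the comments are what I would expect from the four rows in your Test1.

```sas
/* Sketch: running total of the non-missing values of a, b and c. */
/* Test4 is a hypothetical name used only for this example.       */
DATA Test4;
  SET Test1;
  i + SUM(a, b, c);   /* SUM() skips missing arguments within a row */
RUN;

PROC PRINT DATA=Test4;
RUN;

/* Expected i by observation:                                  */
/*   obs 1: 1+2+3 = 6                                          */
/*   obs 2: 6 + 1 = 7                                          */
/*   obs 3: 7 + 3 = 10                                         */
/*   obs 4: all three missing, nothing is added, i stays 10    */
```

In the last row SUM(a,b,c) is itself missing, and the sum statement treats a missing expression as zero, so the running total is simply carried forward.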

Upvotes: 2
