Jonathan Wilson
Jonathan Wilson

Reputation: 686

concatenating two datasets and getting unexpected results in SAS

I came across a rather weird situation where I concatenated two datasets and got really unexpected results. Here's the setup:

data a;
    do i=1 to 100;
        output;
    end;
run;


data b;
    do j=5 to 79;
        output;
    end;
run;

data c;
    set a b;
    if j=. then j=i;
run;

Can someone explain to me why the j takes on the value of "1" for the first 100 observations? It looks like j is being retained, but even weirder, if I change the conditional to:

data c;
    set a b;
    if i=. then i=j;
run;

Then a "retain" statement" would imply that after the 100th observation, you should see 100 retained down, but it isn't! It's 5! What is going on?

Thanks,

Upvotes: 1

Views: 59

Answers (1)

Tom
Tom

Reputation: 51601

All variables that are sourced from input datasets are "retained". You just don't normally notice it because the values are overwritten when the SET statement executes. In your example dataset B does not contribute to the first few observations so the value you put into the variable J on the first iteration of the data step is not overwritten until you start reading from dataset B.

The other wrinkle is that when you do make the transition from reading dataset A to reading dateset B then the variables that are sourced from A are cleared.

Adding some PUT statements to your data step so you can watch what is happening will make it clearer.

data both;
  put / 'TOP   :' (_n_ i j) (=);
  set a(obs=3) b(obs=3) ;
  put 'MIDDLE:' (_n_ i j) (=);
  if missing(j) then j=i ;
  put 'BOTTOM:' (_n_ i j) (=);
run;

Results

TOP   :_N_=1 i=. j=.
MIDDLE:_N_=1 i=1 j=.
BOTTOM:_N_=1 i=1 j=1

TOP   :_N_=2 i=1 j=1
MIDDLE:_N_=2 i=2 j=1
BOTTOM:_N_=2 i=2 j=1

TOP   :_N_=3 i=2 j=1
MIDDLE:_N_=3 i=3 j=1
BOTTOM:_N_=3 i=3 j=1

TOP   :_N_=4 i=3 j=1
MIDDLE:_N_=4 i=. j=5
BOTTOM:_N_=4 i=. j=5

TOP   :_N_=5 i=. j=5
MIDDLE:_N_=5 i=. j=6
BOTTOM:_N_=5 i=. j=6

TOP   :_N_=6 i=. j=6
MIDDLE:_N_=6 i=. j=7
BOTTOM:_N_=6 i=. j=7

TOP   :_N_=7 i=. j=7

Upvotes: 3

Related Questions