mustafghan
mustafghan

Reputation: 171

Sas calculation program doesn't run

I have the following data set:

Date    jobboardid       Sales
Jan05     3              256    
Jan05     6              70
Jan05     54             90    
Feb05     32             456
Feb05     11             89
Feb05     16             876
March05
April05
. 
.
.
Jan06     6              678
Jan06     54             87
Jan06     13             56 
Feb06     McDonald       67
Feb06     11             281
Feb06     16             876
March06
April06
.
.
.
Jan07    6               567
Jan07    54              76
Jan07    34              87
Feb07    10              678
Feb07    11              765
Feb07    16              67
March07   
April06

I am trying to calculate a 12 month growth rate for Sales column when jobboardid column has the same value 12 months apart. I have the following code:

data Want;
set Have;
by Date jobboardid;
format From Till monyy7.;
from = lag12(Date);
oldsales = lag12(sales);

if lag12 (jobboardid) EQ jobboardid 
and INTCK('month', from, Date) EQ 12 then do;
    till = Date;
    rate =  (sales - oldsales) / oldsales;
    output;
end;
run;

However I keep getting the following error message: Note: Missing values were created as a result of performing operation on missing values.

But when I checked my dataset, there aren't any missing values. What's the problem?

Note: My date column is in monyy7. format. jobboardid is numeric value and so does the Sales.

Upvotes: 0

Views: 78

Answers (1)

Quentin
Quentin

Reputation: 6378

The NOTE is being thrown by the INTCK() function. When you say from=lag12(date) the first 12 records will have a missing value for from. And then INTCK('month', from, Date) will throw the NOTE. Even though INTCK is not used in an assignment statement, it still throws the NOTE because one of its arguments has a missing value. Below is an example. The log reports that missing values were created 12 times, because I used lag12.

77   data have;
78     do Date=1 to 20;
79       output;
80     end;
81   run;

NOTE: The data set WORK.HAVE has 20 observations and 1 variables.

82   data want;
83     set have;
84     from=lag12(Date);
85     if intck('month',from,today())=. then put 'Missing: ' (_n_ Date)(=);
86     else put 'Not Missing: ' (_n_ Date)(=);
87   run;

Missing: _N_=1 Date=1
Missing: _N_=2 Date=2
Missing: _N_=3 Date=3
Missing: _N_=4 Date=4
Missing: _N_=5 Date=5
Missing: _N_=6 Date=6
Missing: _N_=7 Date=7
Missing: _N_=8 Date=8
Missing: _N_=9 Date=9
Missing: _N_=10 Date=10
Missing: _N_=11 Date=11
Missing: _N_=12 Date=12
Not Missing: _N_=13 Date=13
Not Missing: _N_=14 Date=14
Not Missing: _N_=15 Date=15
Not Missing: _N_=16 Date=16
Not Missing: _N_=17 Date=17
Not Missing: _N_=18 Date=18
Not Missing: _N_=19 Date=19
Not Missing: _N_=20 Date=20
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      12 at 85:6
NOTE: There were 20 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 20 observations and 2 variables.

One way to avoid the problem would be to add another do block something like (untested):

if lag12 (jobboardid) EQ jobboardid and _n_> 12 then do;
  if INTCK('month', from, Date) EQ 12 then do;
      till = Date;
      rate =  (sales - oldsales) / oldsales;
      output;
  end;
end; 

Upvotes: 1

Related Questions