phdstudent
phdstudent

Reputation: 1144

Deaccumulate values in sas

I have a SAS dataset that looks like the following:

  Year  Qtr Variable
2009    1   10
2009    2   15
2009    3   20
2009    4   25
2010    1   5
2010    2   7
2010    3   9
2010    4   12
2011    1   14
2011    2   17
2011    3   21
2011    4   26

The variable gets accumulated over quarters. I want to replace that variable with the actual increase during the quarter i.e., I want the table to look like:

Year    Qtr Variable
2009    1   10
2009    2   5
2009    3   5
2009    4   5
2010    1   5
2010    2   2
2010    3   2
2010    4   3
2011    1   14
2011    2   3
2011    3   4
2011    4   5

If it is the first quarter leave the value unchanged, otherwise subtract the previous value. Is there any command to do this in SAS?

Also, not sure if it helps, but this is what I am trying to do:

data ppe;
set ppeqrt;
by gvkey;
where (sppey>=0 or sppey=.) and sppey not in (1,2,3,7,.c,.i,.a,.s);
year=year(datadate);
qtr=qtr(datadate);
keep ppentq sppey year qtr;

**** Here I am missing the differences of sppey by year and quarter whithin each gvkey (I did not include this variable above not to burden this) ******    


proc means data=ppe noprint;
var ppentq sppey; 
class year qtr;
types year*qtr;
output out=ppesums mean= sum=/autoname;

Upvotes: 1

Views: 68

Answers (3)

Søren Lassen
Søren Lassen

Reputation: 61

Just a note about the DIF function: As shown, using the IFN function, it works as intended. But only because the IFN function always calculates both output values before choosing one. In other words, this works:

variable = ifn(first.year,variable,dif(variable));

but this does not:

if not first.year then variable=dif(variable);

because the DIF function should be called every time to work as intended here, as the DIF and LAG functions work by storing value(s) from previous calls in an internal stack.

Upvotes: 0

Joe
Joe

Reputation: 63424

Similar to lag is the dif function which is precisely what you need I think. It performs both the lag and the subtraction at the same time. I show ifn here which allows dif (or lag) to work as desired in this setup (but you could just as easily do it in a separate variable like Andrew does).

data have;
input Year  Qtr Variable;
datalines;
2009    1   10
2009    2   15
2009    3   20
2009    4   25
2010    1   5
2010    2   7
2010    3   9
2010    4   12
2011    1   14
2011    2   17
2011    3   21
2011    4   26
;;;;
run;

data want;
  set have;
  by year qtr;
  variable = ifn(first.year,variable,dif(variable));
run;

If you have another grouping variable the grouping variable goes before year in the by statement.

Upvotes: 2

Andrew Haynes
Andrew Haynes

Reputation: 2640

Assuming your data are sorted by Year (if not just proc sort them), then the lag function works well:

data want;
   set have;
   by year;

   lag_var = lag(variable);

   if first.year ^= 1 then Variable = Variable - lag_var;

   drop lag_var;
run;

Upvotes: 2

Related Questions