Reputation: 1144
I have a SAS dataset that looks like the following:
Year Qtr Variable
2009 1 10
2009 2 15
2009 3 20
2009 4 25
2010 1 5
2010 2 7
2010 3 9
2010 4 12
2011 1 14
2011 2 17
2011 3 21
2011 4 26
The variable gets accumulated over quarters. I want to replace that variable with the actual increase during the quarter i.e., I want the table to look like:
Year Qtr Variable
2009 1 10
2009 2 5
2009 3 5
2009 4 5
2010 1 5
2010 2 2
2010 3 2
2010 4 3
2011 1 14
2011 2 3
2011 3 4
2011 4 5
If it is the first quarter leave the value unchanged, otherwise subtract the previous value. Is there any command to do this in SAS?
Also, not sure if it helps, but this is what I am trying to do:
data ppe;
set ppeqrt;
by gvkey;
where (sppey>=0 or sppey=.) and sppey not in (1,2,3,7,.c,.i,.a,.s);
year=year(datadate);
qtr=qtr(datadate);
keep ppentq sppey year qtr;
**** Here I am missing the differences of sppey by year and quarter whithin each gvkey (I did not include this variable above not to burden this) ******
proc means data=ppe noprint;
var ppentq sppey;
class year qtr;
types year*qtr;
output out=ppesums mean= sum=/autoname;
Upvotes: 1
Views: 68
Reputation: 61
Just a note about the DIF function: As shown, using the IFN function, it works as intended. But only because the IFN function always calculates both output values before choosing one. In other words, this works:
variable = ifn(first.year,variable,dif(variable));
but this does not:
if not first.year then variable=dif(variable);
because the DIF function should be called every time to work as intended here, as the DIF and LAG functions work by storing value(s) from previous calls in an internal stack.
Upvotes: 0
Reputation: 63424
Similar to lag
is the dif
function which is precisely what you need I think. It performs both the lag and the subtraction at the same time. I show ifn
here which allows dif
(or lag
) to work as desired in this setup (but you could just as easily do it in a separate variable like Andrew does).
data have;
input Year Qtr Variable;
datalines;
2009 1 10
2009 2 15
2009 3 20
2009 4 25
2010 1 5
2010 2 7
2010 3 9
2010 4 12
2011 1 14
2011 2 17
2011 3 21
2011 4 26
;;;;
run;
data want;
set have;
by year qtr;
variable = ifn(first.year,variable,dif(variable));
run;
If you have another grouping variable the grouping variable goes before year
in the by
statement.
Upvotes: 2
Reputation: 2640
Assuming your data are sorted by Year (if not just proc sort them), then the lag function works well:
data want;
set have;
by year;
lag_var = lag(variable);
if first.year ^= 1 then Variable = Variable - lag_var;
drop lag_var;
run;
Upvotes: 2