Reputation: 325
I have a data frame with a KEY/ID column, a year column, and two variables, V1 and V2.
KEY V1 V2 YEAR
1 10 5 1990
1 20 10 1991
1 30 15 1992
2 40 20 1990
2 50 25 1991
2 60 30 1992
I would like to compute the percent change in V1 from one year to the next. That is, I would like to compute (V1[i+1]-V1[i])/V1[i], but only when the value in KEY[i+1] is equal to the value in KEY[i]. When they are different, I would like to get an NA.
KEY V1 V2 YEAR CHANGE
1 10 5 1990 1
1 20 10 1991 0.5
1 30 15 1992 NA
2 40 20 1990 0.25
2 50 25 1991 0.2
2 60 30 1992 NA
This is my attempt, using the Delt function from the quantmod package and ddply from plyr.
data$change <- ddply(data, "data$KEY", transform, DeltaCol=Delt(data$V1) )
Unfortunately, it doesn't do the trick.
Any help would be appreciated.
Upvotes: 0
Views: 361
Reputation: 263352
I don't know how to do it with ddply, but it's pretty easy with ave:
> dat$pctchg <- ave(dat$V1, dat$KEY, FUN=function(x) c( NA, diff(x)/x[-length(x)]) )
> dat
KEY V1 V2 YEAR pctchg
1 1 10 5 1990 NA
2 1 20 10 1991 1.00
3 1 30 15 1992 0.50
4 2 40 20 1990 NA
5 2 50 25 1991 0.25
6 2 60 30 1992 0.20
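Note that this puts the NA on the first row of each KEY (the change from the previous year). If you want it on the last row instead, as in the layout in the question, a minimal sketch is to move the NA to the end of the vector. The data frame is rebuilt here only so the snippet runs on its own, and the column name CHANGE is simply taken from the question:

```r
# Rebuild the example data from the question
dat <- data.frame(
  KEY  = c(1, 1, 1, 2, 2, 2),
  V1   = c(10, 20, 30, 40, 50, 60),
  V2   = c(5, 10, 15, 20, 25, 30),
  YEAR = rep(1990:1992, times = 2)
)

# Forward-looking change: (V1[i+1] - V1[i]) / V1[i],
# with NA on the last row of each KEY group
dat$CHANGE <- ave(dat$V1, dat$KEY,
                  FUN = function(x) c(diff(x) / x[-length(x)], NA))
dat
```

This assumes the rows are already sorted by YEAR within each KEY, as they are in the example.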
ave works when you want a result that depends on only one vector within any number of categories. As far as I know, you cannot do multiple-vector calculations with ave, nor do you have access to the factor levels within the function. If you want the same calculation(s) on all of a group of vectors considered separately, then aggregate is the best choice. Finally, if you want calculations that each depend on multiple vectors, use either do.call(rbind, by(dat, cats, function)) or lapply(split(dat, cats), function).
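As a rough illustration of the last two cases, here is a sketch on the question's data. The helper name add_pctchg is made up for this example, and sorting by YEAR is just one way to make the calculation depend on more than one column:

```r
# Rebuild the example data from the question
dat <- data.frame(
  KEY  = c(1, 1, 1, 2, 2, 2),
  V1   = c(10, 20, 30, 40, 50, 60),
  V2   = c(5, 10, 15, 20, 25, 30),
  YEAR = rep(1990:1992, times = 2)
)

# Same calculation on several vectors considered separately: aggregate
agg <- aggregate(cbind(V1, V2) ~ KEY, data = dat, FUN = mean)

# Calculation that depends on several columns (V1 and YEAR):
# hypothetical helper that sorts one KEY's rows by YEAR, then
# computes the change in V1 relative to the previous year
add_pctchg <- function(d) {
  d <- d[order(d$YEAR), ]
  d$pctchg <- c(NA, diff(d$V1) / d$V1[-nrow(d)])
  d
}

# Split by KEY, apply the helper to each piece, and stack the pieces again
res <- do.call(rbind, lapply(split(dat, dat$KEY), add_pctchg))
rownames(res) <- NULL  # drop the "1.1", "1.2", ... row names added by rbind
res
```

With lapply(split(...)) each piece is a full data frame, so the function can use any of its columns; the do.call(rbind, ...) step is only needed to get a single data frame back.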
Upvotes: 2