Reputation: 2834
Suppose I have the following data
set.seed(123)
Company <- c(rep("Company 1",5),rep("Company 2",10))
Dates <- c(seq(as.Date("2014-09-01"), as.Date("2015-01-01"), by="months"),
seq(as.Date("2011-09-01"), as.Date("2012-06-01"), by="months"))
X.1 <- sample(c(0,0,5,10,15,20,25,30,40,50),size=15,replace=TRUE)
X.2 <- sample(c(0,0,5,10,15,20,25,30,40,50),size=15,replace=TRUE)
df <- data.frame(Dates,Company,X.1,X.2)
Dates Company X.1 X.2
1 2014-09-01 Company 1 50 0
2 2014-10-01 Company 1 50 5
3 2014-11-01 Company 1 25 15
4 2014-12-01 Company 1 30 5
5 2015-01-01 Company 1 0 40
6 2011-09-01 Company 2 15 0
7 2011-10-01 Company 2 30 15
8 2011-11-01 Company 2 5 30
9 2011-12-01 Company 2 10 0
10 2012-01-01 Company 2 5 20
11 2012-02-01 Company 2 0 5
12 2012-03-01 Company 2 15 0
13 2012-04-01 Company 2 15 30
14 2012-05-01 Company 2 10 40
15 2012-06-01 Company 2 0 10
What I am trying to do is find monthly growth rates for variables X.1
and X.2
within each company and bind those columns to the right side of the dataframe. The problem here is that the date ranges for each Company are different, which is why I am having trouble with this. Also, since I have 0s in the data Inf
and NA
s are okay.
Thanks for your help.
Upvotes: 1
Views: 971
Reputation: 15458
#I computed the growth using log: growth=log(X(t)/X(t-1)). If you want to
compute using (X(t)-X(t-1))/X(t-1), you can just use that. Also, for the first
period, it will be NA
.
#Assumption: the data are equally spaced for each company. You get Inf, if
your last period value is 0 and -Inf if your current period value is 0 (because we used log). The growth will be 0 (when current period is zero) if you don't use log (see second method)
library(dplyr)
df%>%
group_by(Company)%>%
mutate(gx_1=log(X.1/lag(X.1,1)),gx_2=log(X.1/lag(X.2,1))
)
Source: local data frame [15 x 6]
Groups: Company
Dates Company X.1 X.2 gx_1 gx_2
1 2014-09-01 Company 1 5 40 NA NA
2 2014-10-01 Company 1 30 5 1.7917595 -0.2876821
3 2014-11-01 Company 1 15 0 -0.6931472 1.0986123
4 2014-12-01 Company 1 40 10 0.9808293 Inf
5 2015-01-01 Company 1 50 50 0.2231436 1.6094379
6 2011-09-01 Company 2 0 40 NA NA
7 2011-10-01 Company 2 20 25 Inf -0.6931472
8 2011-11-01 Company 2 40 25 0.6931472 0.4700036
9 2011-12-01 Company 2 20 50 -0.6931472 -0.2231436
10 2012-01-01 Company 2 15 25 -0.2876821 -1.2039728
11 2012-02-01 Company 2 50 30 1.2039728 0.6931472
12 2012-03-01 Company 2 15 20 -1.2039728 -0.6931472
13 2012-04-01 Company 2 25 20 0.5108256 0.2231436
14 2012-05-01 Company 2 20 5 -0.2231436 0.0000000
15 2012-06-01 Company 2 0 0 -Inf -Inf
#without using log , i.e.
df%>%
group_by(Company)%>%
mutate(gx_1=((X.1-lag(X.1,1))/lag(X.1,1)),gx_2=((X.2-lag(X.2,1))/lag(X.2,1)))
Upvotes: 2