Reputation: 475
I have the following panel data:
id date returns
1 Jan 09 -0.07142857
1 Feb 09 -0.09615385
1 Mrz 09 0.03273322
1 Apr 09 0.14896989
1 May 09 0.06620690
1 Jun 09 -0.01811125
1 Jul 09 -0.07142857
1 Aug 09 -0.09615385
1 Sep 09 0.03273322
1 Oct 09 0.14896989
1 Nov 09 0.06620690
1 Dez 09 -0.01811125
2 Aug 09 -0.09615385
2 Sep 09 0.03273322
2 Oct 09 0.14896989
2 Nov 09 0.06620690
2 Dez 09 -0.01811125
What I would like to get is a new column that has the annualized returns of the individual IDs for each year. If the firm doesnt have full 12 returns for a year such as id 2 in the example the annualized return should be based on the months that are available: e.g. RETannual = prod(1+RETmonthly)^(1/5)
The output should then look like this:
id date returns RETan
1 Jan 09 -0.07142857
1 Feb 09 -0.09615385
1 Mrz 09 0.03273322
1 Apr 09 0.14896989
1 May 09 0.06620690
1 Jun 09 -0.01811125
1 Jul 09 -0.07142857
1 Aug 09 -0.09615385
1 Sep 09 0.03273322
1 Oct 09 0.14896989
1 Nov 09 0.06620690
1 Dez 09 -0.01811125 0.00697433
2 Aug 09 -0.09615385
2 Sep 09 0.03273322
2 Oct 09 0.14896989
2 Nov 09 0.06620690
2 Dez 09 -0.01811125 0.023432056
Upvotes: 1
Views: 64
Reputation: 1101
Using data.table
you can try
df<- read.table(stringsAsFactors = FALSE, header = TRUE, text ="id date returns
1 Jan-09 -0.07142857
1 Feb-09 -0.09615385
1 Mrz-09 0.03273322
1 Apr-09 0.14896989
1 May-09 0.06620690
1 Jun-09 -0.01811125
1 Jul-09 -0.07142857
1 Aug-09 -0.09615385
1 Sep-09 0.03273322
1 Oct-09 0.14896989
1 Nov-09 0.06620690
1 Dez-09 -0.01811125
2 Aug-09 -0.09615385
2 Sep-09 0.03273322
2 Oct-09 0.14896989
2 Nov-09 0.06620690
2 Dez-09 -0.01811125")
library(data.table)
setDT(df)[, .(RETan = prod(1+returns)^(1/.N)), by = id]
#returns
id RETan
1: 1 1.006974
2: 2 1.023432
Of course, I'm not obtaining the same format as you, to do that you can try:
setDT(df)[, .(date = date, RETan = c(rep(NA,.N-1),prod(1+returns)^(1/.N))), by = id]
#returns
id date RETan
1: 1 Jan-09 NA
2: 1 Feb-09 NA
3: 1 Mrz-09 NA
4: 1 Apr-09 NA
5: 1 May-09 NA
6: 1 Jun-09 NA
7: 1 Jul-09 NA
8: 1 Aug-09 NA
9: 1 Sep-09 NA
10: 1 Oct-09 NA
11: 1 Nov-09 NA
12: 1 Dez-09 1.006974
13: 2 Aug-09 NA
14: 2 Sep-09 NA
15: 2 Oct-09 NA
16: 2 Nov-09 NA
17: 2 Dez-09 1.023432
Upvotes: 1
Reputation: 887691
We can do a group by operation
library(tidyverse)
library(zoo)
df1 %>%
group_by(id, year = year(as.yearmon(date, format = "%b %y"))) %>%
mutate(RETan =prod(1+returns)^(1/n()),
RETan = replace(RETan, row_number() < n(), NA_real_))
Upvotes: 1