user9259005
user9259005

Reputation: 475

Calculating annualized return with uneven panel data set

I have the following panel data:

         id   date     returns  
         1    Jan 09 -0.07142857 
         1    Feb 09 -0.09615385 
         1    Mrz 09  0.03273322  
         1    Apr 09  0.14896989  
         1    May 09  0.06620690  
         1    Jun 09 -0.01811125 
         1    Jul 09 -0.07142857 
         1    Aug 09 -0.09615385 
         1    Sep 09  0.03273322  
         1    Oct 09  0.14896989  
         1    Nov 09  0.06620690  
         1    Dez 09  -0.01811125 

         2    Aug 09 -0.09615385 
         2    Sep 09  0.03273322  
         2    Oct 09  0.14896989  
         2    Nov 09  0.06620690  
         2    Dez 09 -0.01811125 

What I would like to get is a new column that has the annualized returns of the individual IDs for each year. If the firm doesnt have full 12 returns for a year such as id 2 in the example the annualized return should be based on the months that are available: e.g. RETannual = prod(1+RETmonthly)^(1/5)

The output should then look like this:

         id   date     returns     RETan
         1    Jan 09 -0.07142857 
         1    Feb 09 -0.09615385 
         1    Mrz 09  0.03273322  
         1    Apr 09  0.14896989  
         1    May 09  0.06620690  
         1    Jun 09 -0.01811125 
         1    Jul 09 -0.07142857 
         1    Aug 09 -0.09615385 
         1    Sep 09  0.03273322  
         1    Oct 09  0.14896989  
         1    Nov 09  0.06620690  
         1    Dez 09  -0.01811125  0.00697433


         2    Aug 09 -0.09615385 
         2    Sep 09  0.03273322  
         2    Oct 09  0.14896989  
         2    Nov 09  0.06620690  
         2    Dez 09 -0.01811125   0.023432056

Upvotes: 1

Views: 64

Answers (2)

Chriss Paul
Chriss Paul

Reputation: 1101

Using data.table you can try

df<- read.table(stringsAsFactors = FALSE, header = TRUE, text ="id   date     returns  
1    Jan-09 -0.07142857 
1    Feb-09 -0.09615385 
1    Mrz-09  0.03273322  
1    Apr-09  0.14896989  
1    May-09  0.06620690  
1    Jun-09 -0.01811125 
1    Jul-09 -0.07142857 
1    Aug-09 -0.09615385 
1    Sep-09  0.03273322  
1    Oct-09  0.14896989  
1    Nov-09  0.06620690  
1    Dez-09  -0.01811125 
2    Aug-09 -0.09615385 
2    Sep-09  0.03273322  
2    Oct-09  0.14896989  
2    Nov-09  0.06620690  
2    Dez-09 -0.01811125")

library(data.table)
setDT(df)[, .(RETan = prod(1+returns)^(1/.N)), by = id]

#returns
   id    RETan
1:  1 1.006974
2:  2 1.023432

Of course, I'm not obtaining the same format as you, to do that you can try:

setDT(df)[, .(date = date, RETan = c(rep(NA,.N-1),prod(1+returns)^(1/.N))), by = id]

#returns
    id   date    RETan
 1:  1 Jan-09       NA
 2:  1 Feb-09       NA
 3:  1 Mrz-09       NA
 4:  1 Apr-09       NA
 5:  1 May-09       NA
 6:  1 Jun-09       NA
 7:  1 Jul-09       NA
 8:  1 Aug-09       NA
 9:  1 Sep-09       NA
10:  1 Oct-09       NA
11:  1 Nov-09       NA
12:  1 Dez-09 1.006974
13:  2 Aug-09       NA
14:  2 Sep-09       NA
15:  2 Oct-09       NA
16:  2 Nov-09       NA
17:  2 Dez-09 1.023432

Upvotes: 1

akrun
akrun

Reputation: 887691

We can do a group by operation

library(tidyverse)
library(zoo)
df1 %>% 
 group_by(id, year = year(as.yearmon(date, format = "%b %y"))) %>% 
 mutate(RETan =prod(1+returns)^(1/n()),
        RETan = replace(RETan, row_number() < n(), NA_real_))

Upvotes: 1

Related Questions