Laurence_jj
Laurence_jj

Reputation: 726

Weighted average each year - R

I am looking to calculate the average of the interest_rate for each year (2006 to 2023). I have a series of interest rate decision dates for the new rate and the date.

For years with multiple interest_rates during a year I would like to weight the average value by the proportion of year that at that particular rate.

Example data:

library(data.table)
df = data.table(date = c("03/08/2006", "09/11/2006", "11/01/2007", "10/05/2007", "05/07/2007", "06/12/2007", "07/02/2008", "10/04/2008", "08/10/2008", "06/11/2008", "04/12/2008", "08/01/2009", "05/02/2009", "05/03/2009", "04/08/2016", "02/11/2017", "02/08/2018", "11/03/2020", "19/03/2020", "16/12/2021", "03/02/2022", "17/03/2022", "05/05/2022", "16/06/2022", "04/08/2022", "22/09/2022", "03/11/2022", "15/12/2022", "02/02/2023", "23/03/2023", "11/05/2023"),
                interest_rate = c(4.7500, 5.0000, 5.2500, 5.5000, 5.7500, 5.5000, 5.2500, 5.0000, 4.5000, 3.0000, 2.0000, 1.5000, 1.0000, 0.5000, 0.2500, 0.5000, 0.7500, 0.2500, 0.1000, 0.2500, 0.5000, 0.7500, 1.0000, 1.2500, 1.7500, 2.2500, 3.0000, 3.5000, 4.0000, 4.2500, 4.500))

The resultant data would look like:

year weighted_interest_rate
2006 4.7
2007 ...
2008 etc

Upvotes: 0

Views: 88

Answers (2)

Laurence_jj
Laurence_jj

Reputation: 726

Inelegant solution:

# add to the dataframe a list of dates for the start of each year
df= rbind(df, data.table(date = as.Date(ISOdate(seq(2005,2023),1,1)), "interest_rate" = NA), fill=T)

# set order
setorder(df, -date)

# set value for the new dates to the current interest rate
df[is.na(interest_rate), interest_rate := df[df[is.na(interest_rate)], on = .(date), interest_rate, roll=Inf]]

# identify the previous shift date was
df[, change_date := shift(date, 1L)]

# get next rate change or end of year
df[, next_change_date_or_year := min(as.Date(ISOdate(year,12,31)), change_date, na.rm=T), date]

# get relative duration within year
df[, relative_duration := as.numeric((next_change_date_or_year-date)/365)]

df[, year := year(date)]

# weighted average per year
yearly_IR = df[!is.na(year), .(weight_interest_rate = weighted.mean(`interest rate`, relative_duration)), year]

Upvotes: 1

Allan Cameron
Allan Cameron

Reputation: 174278

One method would be to create a data frame of the interest rates for each day in the entire series, group by year, then average the interest rate for each date in the year.

An important first step is converting the dates-as-strings to actual dates.

library(tidyverse)

df %>%
  mutate(date = as.Date(date, '%d/%m/%Y'),
         days = c(as.numeric(diff(date)), 0)) %>%
  rowwise() %>%
  reframe(date = seq(date, by = 'day', length.out = days),
          interest_rate = interest_rate,
          year = lubridate::year(date)) %>%
  group_by(year) %>%
  summarize(interest_rate = mean(interest_rate))
#> # A tibble: 18 x 2
#>     year interest_rate
#>    <dbl>         <dbl>
#>  1  2006         4.84 
#>  2  2007         5.51 
#>  3  2008         4.67 
#>  4  2009         0.644
#>  5  2010         0.5  
#>  6  2011         0.5  
#>  7  2012         0.5  
#>  8  2013         0.5  
#>  9  2014         0.5  
#> 10  2015         0.5  
#> 11  2016         0.398
#> 12  2017         0.291
#> 13  2018         0.604
#> 14  2019         0.75 
#> 15  2020         0.228
#> 16  2021         0.107
#> 17  2022         1.47 
#> 18  2023         3.97 

Upvotes: 1

Related Questions