Reputation: 726
I am looking to calculate the average of the interest_rate
for each year
(2006 to 2023). I have a series of interest rate decision dates for the new rate and the date.
For years with multiple interest_rates
during a year I would like to weight the average value by the proportion of year that at that particular rate.
Example data:
library(data.table)
df = data.table(date = c("03/08/2006", "09/11/2006", "11/01/2007", "10/05/2007", "05/07/2007", "06/12/2007", "07/02/2008", "10/04/2008", "08/10/2008", "06/11/2008", "04/12/2008", "08/01/2009", "05/02/2009", "05/03/2009", "04/08/2016", "02/11/2017", "02/08/2018", "11/03/2020", "19/03/2020", "16/12/2021", "03/02/2022", "17/03/2022", "05/05/2022", "16/06/2022", "04/08/2022", "22/09/2022", "03/11/2022", "15/12/2022", "02/02/2023", "23/03/2023", "11/05/2023"),
interest_rate = c(4.7500, 5.0000, 5.2500, 5.5000, 5.7500, 5.5000, 5.2500, 5.0000, 4.5000, 3.0000, 2.0000, 1.5000, 1.0000, 0.5000, 0.2500, 0.5000, 0.7500, 0.2500, 0.1000, 0.2500, 0.5000, 0.7500, 1.0000, 1.2500, 1.7500, 2.2500, 3.0000, 3.5000, 4.0000, 4.2500, 4.500))
The resultant data would look like:
year | weighted_interest_rate |
---|---|
2006 | 4.7 |
2007 | ... |
2008 | etc |
Upvotes: 0
Views: 88
Reputation: 726
Inelegant solution:
# add to the dataframe a list of dates for the start of each year
df= rbind(df, data.table(date = as.Date(ISOdate(seq(2005,2023),1,1)), "interest_rate" = NA), fill=T)
# set order
setorder(df, -date)
# set value for the new dates to the current interest rate
df[is.na(interest_rate), interest_rate := df[df[is.na(interest_rate)], on = .(date), interest_rate, roll=Inf]]
# identify the previous shift date was
df[, change_date := shift(date, 1L)]
# get next rate change or end of year
df[, next_change_date_or_year := min(as.Date(ISOdate(year,12,31)), change_date, na.rm=T), date]
# get relative duration within year
df[, relative_duration := as.numeric((next_change_date_or_year-date)/365)]
df[, year := year(date)]
# weighted average per year
yearly_IR = df[!is.na(year), .(weight_interest_rate = weighted.mean(`interest rate`, relative_duration)), year]
Upvotes: 1
Reputation: 174278
One method would be to create a data frame of the interest rates for each day in the entire series, group by year, then average the interest rate for each date in the year.
An important first step is converting the dates-as-strings to actual dates.
library(tidyverse)
df %>%
mutate(date = as.Date(date, '%d/%m/%Y'),
days = c(as.numeric(diff(date)), 0)) %>%
rowwise() %>%
reframe(date = seq(date, by = 'day', length.out = days),
interest_rate = interest_rate,
year = lubridate::year(date)) %>%
group_by(year) %>%
summarize(interest_rate = mean(interest_rate))
#> # A tibble: 18 x 2
#> year interest_rate
#> <dbl> <dbl>
#> 1 2006 4.84
#> 2 2007 5.51
#> 3 2008 4.67
#> 4 2009 0.644
#> 5 2010 0.5
#> 6 2011 0.5
#> 7 2012 0.5
#> 8 2013 0.5
#> 9 2014 0.5
#> 10 2015 0.5
#> 11 2016 0.398
#> 12 2017 0.291
#> 13 2018 0.604
#> 14 2019 0.75
#> 15 2020 0.228
#> 16 2021 0.107
#> 17 2022 1.47
#> 18 2023 3.97
Upvotes: 1