Reputation: 514
I would like to calculate the weighted average life (WAL) of a loan over time in R. The formula to calculate the WAL is given here.
I have the following sample data created in R.
Sample data
library(data.table)
DT<-data.table(date=c(rep(seq(from = 2015, to = 2016.25,by = .25),2),
seq(from = 2015, to = 2017.5,by = .5)),
value=c(rep(100,5), 0, 100, 80, 60, 40, 20, 0, 100, 70, 40, 30, 20, 0),
id=rep(c("a","b","c"),each=6))
DT
date value id
1: 2015.00 100 a
2: 2015.25 100 a
3: 2015.50 100 a
4: 2015.75 100 a
5: 2016.00 100 a
6: 2016.25 0 a
7: 2015.00 100 b
8: 2015.25 80 b
9: 2015.50 60 b
10: 2015.75 40 b
11: 2016.00 20 b
12: 2016.25 0 b
13: 2015.00 100 c
14: 2015.50 70 c
15: 2016.00 40 c
16: 2016.50 30 c
17: 2017.00 20 c
18: 2017.50 0 c
Thus every loan in this example has a maturity of 5 years, and at the maturity date the loan is completely amortized. Note: the dates are not always incremented by one quarter or one half year, but may differ (see the sample data).
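For concreteness, the WAL at a given observation date is the sum of each future principal repayment weighted by its distance from the observation date, divided by the balance outstanding at that date. A minimal sketch for loan "b" observed at 2015.00, with the payment and time vectors read off the sample data above:

```r
# WAL of loan "b" at date 2015.00: the balance of 100 amortizes
# in five repayments of 20, falling 0.25, 0.50, ..., 1.25 years ahead.
payments <- rep(20, 5)
times    <- c(0.25, 0.50, 0.75, 1.00, 1.25)
wal_b    <- sum(payments * times) / 100
wal_b
# 0.75
```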
To calculate the WAL I have created the following R code:
Counter <- unique(DT$id)
# LOOP OVER ID
for (i in 1:length(Counter)) {
  # SUBSET ONE ID
  DTSub <- DT[id == Counter[i], ]
  # LOOP OVER THE AMORTIZATION DATES
  CounterSub <- unique(DTSub$date)
  for (j in 1:length(CounterSub)) {
    # SUBSET RANGE OF DATES IN COUNTERSUB
    DTSub_Date <- DTSub[date >= CounterSub[j], ]
    DTSub_Date[, t := abs(min(date) - date)]
    DT[id == Counter[i] & date == CounterSub[j],
       wal_calc := round(sum(abs(diff(DTSub_Date$value))
                             / max(DTSub_Date$value) * DTSub_Date$t[2:nrow(DTSub_Date)]), 3)]
  }
}
The output of the code
DT
date value id wal_calc
1: 2015.00 100 a 1.250
2: 2015.25 100 a 1.000
3: 2015.50 100 a 0.750
4: 2015.75 100 a 0.500
5: 2016.00 100 a 0.250
6: 2016.25 0 a 0.000
7: 2015.00 100 b 0.750
8: 2015.25 80 b 0.625
9: 2015.50 60 b 0.500
10: 2015.75 40 b 0.375
11: 2016.00 20 b 0.250
12: 2016.25 0 b 0.000
13: 2015.00 100 c 1.300
14: 2015.50 70 c 1.143
15: 2016.00 40 c 1.125
16: 2016.50 30 c 0.833
17: 2017.00 20 c 0.500
18: 2017.50 0 c 0.000
The output of the code is correct (wal_calc), but it uses a double for loop and is therefore slow on relatively large datasets (mine has 77k rows and 200 columns). The first for loop subsets by ID and the second subsets future dates (by id, based on the first subset).
Request
I would like to generate the WALs on this sample data in a faster and more efficient manner, avoiding the double for loop. There might be a very simple solution to this problem.
If anything is unclear please let me know.
Upvotes: 4
Views: 799
Reputation: 34733
This will do it without for loops.
DT[order(date), WAL := {
  pmts <- matrix(value[-.N] - value[-1L],
                 nrow = n2 <- .N - 1L, ncol = n2)
  ts <- matrix(date[-1L] - date[-.N], nrow = n2, ncol = n2)
  ts[upper.tri(ts)] <- 0
  ts <- apply(ts, 2, cumsum)
  c(colSums(pmts * ts) / value[-.N], 0)
}, by = id]
DT
date value id WAL
# 1: 2015.00 100 a 1.2500000
# 2: 2015.25 100 a 1.0000000
# 3: 2015.50 100 a 0.7500000
# 4: 2015.75 100 a 0.5000000
# 5: 2016.00 100 a 0.2500000
# 6: 2016.25 0 a 0.0000000
# 7: 2015.00 100 b 0.7500000
# 8: 2015.25 80 b 0.6250000
# 9: 2015.50 60 b 0.5000000
# 10: 2015.75 40 b 0.3750000
# 11: 2016.00 20 b 0.2500000
# 12: 2016.25 0 b 0.0000000
# 13: 2015.00 100 c 1.3000000
# 14: 2015.50 70 c 1.1428571
# 15: 2016.00 40 c 1.1250000
# 16: 2016.50 30 c 0.8333333
# 17: 2017.00 20 c 0.5000000
# 18: 2017.50 0 c 0.0000000
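To see what the block above computes, here is a sketch that unrolls the same logic outside data.table for id "b" only (the variable names mirror the answer's; the value and date vectors are read off the sample data):

```r
value <- c(100, 80, 60, 40, 20, 0)
date  <- seq(2015, 2016.25, by = 0.25)
n  <- length(value)
n2 <- n - 1L
# column j holds the principal repayments seen from observation date j
pmts <- matrix(value[-n] - value[-1L], nrow = n2, ncol = n2)
# time steps between consecutive dates; zero out the steps before each
# observation date, then cumulate to get each payment's time-to-payment
ts <- matrix(date[-1L] - date[-n], nrow = n2, ncol = n2)
ts[upper.tri(ts)] <- 0
ts <- apply(ts, 2, cumsum)
# payment-weighted times divided by the balance at each observation date;
# the final 0 is the WAL at the maturity date itself
c(colSums(pmts * ts) / value[-n], 0)
# 0.750 0.625 0.500 0.375 0.250 0.000
```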
Upvotes: 3
Reputation: 1173
You could use lapply instead of the first subset loop; then you would only need one for loop.
ids <- unique(DT$id)
# one subset per id, without an explicit loop
DTSubs <- lapply(ids, function(i) DT[id == i])
# the amortization dates to iterate over within each subset
CounterSubs <- lapply(DTSubs, function(d) unique(d$date))
Upvotes: 1