How do I replace a for- loop in R with vector functions in dataframe calculations?

I have been trying to avoid the use of for-loops in R in order to speed up calculations and simplify, relying on vector functions instead where possible. I've succeeded so far, until running into certain amortization calculations. I hit a brick wall and had to resort to a for-loop, see MWE code below. It works, ties out fine, but I'd like to replace it with vector or other more efficient functions. Can someone please help me replace the below with vector functions?

In the full code from which this MWE is extracted, it is reactive using Shiny. The periods and vector rates, actually all variables, change drastically depending on user inputs. The MWE example inputs variables are simplified.

In any case, the below is a very awkward, chainsaw approach and needs to be slimmed down. But I don't know how, having approached this from a complete XLS mindset where I have the most experience. If a for-loop is the only viable option for these sorts of calculations, I welcome any suggestions for improving the below MWE.

At the very bottom is code for a flawed attempt to "vectorize" but results are inaccurate when the vector variables change over periods. I show one of the problems with this vectorized approach in the image at the bottom where ending/beginning balances don't match when moving from one period to the next (the for-loop MWE code doesn't have those problems - it's functional but super clumsy).

For-loop MWE code:

periods        <- 10
beginBal       <- 1000
yield_vector   <- c(0.30,0.30,0.30,0.30,0.30,0.28,0.26,0.20,0.18,0.20)
npr_vector     <- c(0.30,0.30,0.30,0.30,0.30,0.30,0.30,0.30,0.30,0.30)
mpr_vector     <- c(0.20,0.20,0.20,0.20,0.20,0.20,0.20,0.20,0.20,0.20)
default_vector <- c(0.10,0.10,0.10,0.10,0.10,0.09,0.08,0.07,0.06,0.05)

amort <- data.frame(period=seq(1,periods,1),
                    beginBal=rep(NA,periods),
                    yield=rep(NA,periods,),
                    purchases=rep(NA,periods),
                    payments=rep(NA,periods),
                    defaults=rep(NA,periods),
                    endBal=rep(NA,periods))

# Completes first row of data frame
amort[1,2] <- beginBal
amort[1,3] <- beginBal * yield_vector[1]/12
amort[1,4] <- beginBal * npr_vector[1]
amort[1,5] <- beginBal * mpr_vector[1]
amort[1,6] <- beginBal * default_vector[1] / 12
amort[1,7] <- beginBal + amort[1,4] - amort[1,5] - amort[1,6]

# Completes remaining rows of data frame
for(i in 2:nrow(amort)){
amort[i,2] <- amort[i-1,7]
amort[i,3] <- amort[i,2] * yield_vector[i]/12
amort[i,4] <- amort[i,2] * npr_vector[i]
amort[i,5] <- amort[i,2] * mpr_vector[i]
amort[i,6] <- amort[i,2] * default_vector[i]/12
amort[i,7] <- amort[i,2] + amort[i,4] - amort[i,5] - amort[i,6]
}
amort

And here's that sleek-looking but flawed attempt to vectorize, see one of its output flaws in the below image (these problems don't arise in the above for-loop MWE):

amort           <- data.frame(period=seq(1,periods,1))
amort$beginBal  <- beginBal*(1-(mpr_vector[]+default_vector[]/12-npr_vector[]))^(amort$period-1)
amort$yield     <- amort$beginBal*yield_vector[]/12
amort$purchases <- amort$beginBal*npr_vector[]
amort$payments  <- amort$beginBal*mpr_vector[]
amort$defaults  <- amort$beginBal*default_vector[]/12
amort$endBal    <- amort$beginBal+amort$purchases-amort$payments-amort$defaults

amort <- cbind(amort,yield_vector,npr_vector,mpr_vector,default_vector)
amort

enter image description here

Upvotes: 2

Views: 704

Answers (3)

AnilGoyal
AnilGoyal

Reputation: 26238

Without using Reduce in baseR, I would have done it like this

Explanation -

  • for every row you are actually creating an Endbal by multiplying beginBal of that row by 1 + npr_vector - mpr_vector - default_vector/12
  • So I created a dummy/anonymous vector by appending 1 to its start and having cumulative product of it. Like cumprod(c(1, 1 + npr_vector - mpr_vector - default_vector/12)
  • thereafter clipped its last element by using [-(periods + 1)]
  • thereafter multiplying it by beginBal initial value. That will give you beginBal values for every period
  • Mutating rest of the columns are pretty simple.
  • If you require any further explanation, please feel free to ask.
#given data

periods        <- 10
beginBal       <- 1000
yield_vector   <- c(0.30,0.30,0.30,0.30,0.30,0.28,0.26,0.20,0.18,0.20)
npr_vector     <- c(0.30,0.30,0.30,0.30,0.30,0.30,0.30,0.30,0.30,0.30)
mpr_vector     <- c(0.20,0.20,0.20,0.20,0.20,0.20,0.20,0.20,0.20,0.20)
default_vector <- c(0.10,0.10,0.10,0.10,0.10,0.09,0.08,0.07,0.06,0.05)

amort <- data.frame(Period = seq(periods),
                    beginBal = beginBal * cumprod(c(1, 1 + npr_vector - mpr_vector - default_vector/12)[-(periods + 1)]))

amort <- transform(amort, Yeild = beginBal * yield_vector/12,
                    Purchases = beginBal * npr_vector,
                    Payments = beginBal * mpr_vector,
                    defaults = beginBal * default_vector/12,
                    EndBal = beginBal * (1 + npr_vector - mpr_vector - default_vector/12))

amort
#>    Period beginBal    Yeild Purchases Payments  defaults   EndBal
#> 1       1 1000.000 25.00000  300.0000 200.0000  8.333333 1091.667
#> 2       2 1091.667 27.29167  327.5000 218.3333  9.097222 1191.736
#> 3       3 1191.736 29.79340  357.5208 238.3472  9.931134 1300.979
#> 4       4 1300.979 32.52446  390.2936 260.1957 10.841488 1420.235
#> 5       5 1420.235 35.50587  426.0705 284.0470 11.835291 1550.423
#> 6       6 1550.423 36.17654  465.1269 310.0846 11.628174 1693.837
#> 7       7 1693.837 36.69981  508.1512 338.7675 11.292249 1851.929
#> 8       8 1851.929 30.86548  555.5786 370.3858 10.802918 2026.319
#> 9       9 2026.319 30.39478  607.8956 405.2637 10.131594 2218.819
#> 10     10 2218.819 36.98032  665.6457 443.7638  9.245079 2431.456

Created on 2021-07-16 by the reprex package (v2.0.0)


In dplyr only it would be

library(dplyr, warn.conflicts = F)

#amortisation

seq(periods) %>%
  as.data.frame() %>%
  setNames('Period') %>%
  mutate(beginBal = beginBal * cumprod(c(1, 1 + npr_vector - mpr_vector - default_vector/12)[-(periods + 1)]),
         Yeild = beginBal * yield_vector/12,
         Purchases = beginBal * npr_vector,
         Payments = beginBal * mpr_vector,
         defaults = beginBal * default_vector/12,
         EndBal = beginBal * (1 + npr_vector - mpr_vector - default_vector/12))

#>    Period beginBal    Yeild Purchases Payments  defaults   EndBal
#> 1       1 1000.000 25.00000  300.0000 200.0000  8.333333 1091.667
#> 2       2 1091.667 27.29167  327.5000 218.3333  9.097222 1191.736
#> 3       3 1191.736 29.79340  357.5208 238.3472  9.931134 1300.979
#> 4       4 1300.979 32.52446  390.2936 260.1957 10.841488 1420.235
#> 5       5 1420.235 35.50587  426.0705 284.0470 11.835291 1550.423
#> 6       6 1550.423 36.17654  465.1269 310.0846 11.628174 1693.837
#> 7       7 1693.837 36.69981  508.1512 338.7675 11.292249 1851.929
#> 8       8 1851.929 30.86548  555.5786 370.3858 10.802918 2026.319
#> 9       9 2026.319 30.39478  607.8956 405.2637 10.131594 2218.819
#> 10     10 2218.819 36.98032  665.6457 443.7638  9.245079 2431.456

However, in purrr::accumulate the syntax would be

library(tidyverse)

# amortisation

seq(periods) %>%
  as.data.frame() %>%
  setNames('Period') %>%
  mutate(beginBal = accumulate(1 + npr_vector - mpr_vector - default_vector/12, .init = beginBal,
                               ~ .x * .y)[-(n() + 1)],
         Yeild = beginBal * yield_vector/12,
         Purchases = beginBal * npr_vector,
         Payments = beginBal * mpr_vector,
         defaults = beginBal * default_vector/12,
         EndBal = beginBal * (1 + npr_vector - mpr_vector - default_vector/12))

#>    Period beginBal    Yeild Purchases Payments  defaults   EndBal
#> 1       1 1000.000 25.00000  300.0000 200.0000  8.333333 1091.667
#> 2       2 1091.667 27.29167  327.5000 218.3333  9.097222 1191.736
#> 3       3 1191.736 29.79340  357.5208 238.3472  9.931134 1300.979
#> 4       4 1300.979 32.52446  390.2936 260.1957 10.841488 1420.235
#> 5       5 1420.235 35.50587  426.0705 284.0470 11.835291 1550.423
#> 6       6 1550.423 36.17654  465.1269 310.0846 11.628174 1693.837
#> 7       7 1693.837 36.69981  508.1512 338.7675 11.292249 1851.929
#> 8       8 1851.929 30.86548  555.5786 370.3858 10.802918 2026.319
#> 9       9 2026.319 30.39478  607.8956 405.2637 10.131594 2218.819
#> 10     10 2218.819 36.98032  665.6457 443.7638  9.245079 2431.456

Upvotes: 2

Anoushiravan R
Anoushiravan R

Reputation: 21938

This solution can also be used in tidyverse:

library(dplyr)
library(purrr)

data2 <- cbind(period, yield_vector, npr_vector, mpr_vector, default_vector)

data2 %>%
  nest_by(period) %>%
  ungroup() %>%
  mutate(beginBal = accumulate(data[-1], .init = beginBal,
                             ~ .x + 
                               (.x * .y$npr_vector) - 
                               (.x * .y$mpr_vector) - 
                               (.x * .y$default_vector / 12))) %>%
  unnest(data) %>%
  mutate(yield = beginBal * yield_vector/12,
         purchases = beginBal * npr_vector, 
         payments = beginBal * mpr_vector,
         defaults = beginBal * default_vector / 12,
         endBal = beginBal + purchases - payments - defaults) %>%
  select(!contains("vector"))

Output

# A tibble: 10 x 7
   period beginBal yield purchases payments defaults endBal
    <dbl>    <dbl> <dbl>     <dbl>    <dbl>    <dbl>  <dbl>
 1      1    1000   25        300      200      8.33  1092.
 2      2    1092.  27.3      328.     218.     9.10  1192.
 3      3    1192.  29.8      358.     238.     9.93  1301.
 4      4    1301.  32.5      390.     260.    10.8   1420.
 5      5    1420.  35.5      426.     284.    11.8   1550.
 6      6    1552.  36.2      465.     310.    11.6   1695.
 7      7    1696.  36.8      509.     339.    11.3   1855.
 8      8    1856.  30.9      557.     371.    10.8   2031.
 9      9    2033.  30.5      610.     407.    10.2   2226.
10     10    2227.  37.1      668.     445.     9.28  2441.

Upvotes: 1

Onyambu
Onyambu

Reputation: 79318

You could do:

f <- function(x, y){
  x  * (1 + npr_vector[y] - mpr_vector[y] -  default_vector[y] / 12)
}

res <- Reduce(f, seq(periods), init = beginBal, accumulate = TRUE)
b <- head(res, -1)

result <- data.frame(period = seq(periods), beginBal = b,  yield = b * yield_vector/ 12,
           purchases = b * npr_vector,  payments = b * mpr_vector, 
           defaults = b * default_vector/12,   endBal = res[-1])

to check:

result
   period beginBal    yield purchases payments  defaults   endBal
1       1 1000.000 25.00000  300.0000 200.0000  8.333333 1091.667
2       2 1091.667 27.29167  327.5000 218.3333  9.097222 1191.736
3       3 1191.736 29.79340  357.5208 238.3472  9.931134 1300.979
4       4 1300.979 32.52446  390.2936 260.1957 10.841488 1420.235
5       5 1420.235 35.50587  426.0705 284.0470 11.835291 1550.423
6       6 1550.423 36.17654  465.1269 310.0846 11.628174 1693.837
7       7 1693.837 36.69981  508.1512 338.7675 11.292249 1851.929
8       8 1851.929 30.86548  555.5786 370.3858 10.802918 2026.319
9       9 2026.319 30.39478  607.8956 405.2637 10.131594 2218.819
10     10 2218.819 36.98032  665.6457 443.7638  9.245079 2431.456
 

all.equal(result, amort)
[1] TRUE

Upvotes: 3

Related Questions