a_js12
a_js12

Reputation: 329

"Dynamic" Function to Calculate Multiple Columns Simultaneously in R

I have an issue that I'm completely stumped on where/how to start. I'm given a dataset which looks like the picture below.

In the picture the column Current is the starting dollar balance, Paid is the monthly amount that was paid and is always a fixed amount, %ofCurrentThatisExtra is a predicted % of the Current that will be paid in addition to the required $10 that is due on each row, ExtraPayment is the Current multiplied by the %ofCurrentThatisExtra, Outstanding is the Current minus Paid minus ExtraPayment.

I'm given the first row's starting Current (100). What I need to do is calculate the ExtraPayment and Outstanding. Again, if you look at the picture below, the Current is equal to the lag Outstanding.

What I need to do is write a script in R that will do this. But what has me stumped is how to properly write a function which will first calculate ExtraPayment then the Outstanding followed by grabbing the lagged Outstanding and recalculating these columns for the next row. It seems like I need two if_else statements calculating two different columns at the same time and I'm not sure if thats possible in R. Does anyone have any ideas?

enter image description here

Upvotes: 1

Views: 135

Answers (2)

David J. Bosak
David J. Bosak

Reputation: 1624

You can also use a datastep, which basically works the same as a for loop, but is a little cleaner.

Data:

dat <- tibble(
  current = c(100, rep(NA, 6)), 
  paid = 10, 
  pct_extra = c(.02,.05,.05,.07, .03, .01, .09)
)

Calculations:

library(libr)

dat2 <- datastep(dat, 
                 retain = list(Outstanding = 0),
  {
  
    # Assign current value
    if (n. > 1)
      current <- Outstanding
    
    # Calculate extra payment
    ExtraPayment <- current * pct_extra
    
    # Calculate outstanding balance 
    Outstanding <- current - paid - ExtraPayment

  })

dat2
# # A tibble: 7 x 5
#   current  paid pct_extra Outstanding ExtraPayment
#     <dbl> <dbl>     <dbl>       <dbl>        <dbl>
# 1   100      10      0.02        88          2    
# 2    88      10      0.05        73.6        4.4  
# 3    73.6    10      0.05        59.9        3.68 
# 4    59.9    10      0.07        45.7        4.19 
# 5    45.7    10      0.03        34.4        1.37 
# 6    34.4    10      0.01        24.0        0.344
# 7    24.0    10      0.09        11.8        2.16 

Upvotes: 2

DaveArmstrong
DaveArmstrong

Reputation: 21992

Doing this with a for loop makes the most sense to me. First, here's the data

dat <- tibble(
  current = c(100, rep(NA, 6)), 
  paid = 10, 
  pct_extra = c(.02,.05,.05,.07, .03, .01, .09), 
  ExtraPayment = NA, 
  Outstanding = NA
)

dat
# # A tibble: 7 x 5
#   current  paid pct_extra ExtraPayment Outstanding
#     <dbl> <dbl>     <dbl> <lgl>        <lgl>      
# 1     100    10      0.02 NA           NA         
# 2      NA    10      0.05 NA           NA         
# 3      NA    10      0.05 NA           NA         
# 4      NA    10      0.07 NA           NA         
# 5      NA    10      0.03 NA           NA         
# 6      NA    10      0.01 NA           NA         
# 7      NA    10      0.09 NA           NA         

Now, the loop that does the work:

for(i in 1:nrow(dat)){
  dat$ExtraPayment[i] <- dat$current[i]*dat$pct_extra[i]
  dat$Outstanding[i] <- dat$current[i] - dat$paid[i] - dat$ExtraPayment[i]
  if(i < nrow(dat)){
    dat$current[(i+1)] <- dat$Outstanding[i]
  }
}

dat
# # A tibble: 7 x 5
#   current  paid pct_extra ExtraPayment Outstanding
#     <dbl> <dbl>     <dbl>        <dbl>       <dbl>
# 1   100      10      0.02        2            88  
# 2    88      10      0.05        4.4          73.6
# 3    73.6    10      0.05        3.68         59.9
# 4    59.9    10      0.07        4.19         45.7
# 5    45.7    10      0.03        1.37         34.4
# 6    34.4    10      0.01        0.344        24.0
# 7    24.0    10      0.09        2.16         11.8

Upvotes: 2

Related Questions