Reputation: 329
I have an issue that I'm completely stumped on where/how to start. I'm given a dataset which looks like the picture below.
In the picture the column Current is the starting dollar balance, Paid is the monthly amount that was paid and is always a fixed amount, %ofCurrentThatisExtra is a predicted % of the Current that will be paid in addition to the required $10 that is due on each row, ExtraPayment is the Current multiplied by the %ofCurrentThatisExtra, Outstanding is the Current minus Paid minus ExtraPayment.
I'm given the first row's starting Current (100). What I need to do is calculate the ExtraPayment and Outstanding. Again, if you look at the picture below, the Current is equal to the lag Outstanding.
What I need to do is write a script in R that will do this. But what has me stumped is how to properly write a function which will first calculate ExtraPayment then the Outstanding followed by grabbing the lagged Outstanding and recalculating these columns for the next row. It seems like I need two if_else
statements calculating two different columns at the same time and I'm not sure if thats possible in R. Does anyone have any ideas?
Upvotes: 1
Views: 135
Reputation: 1624
You can also use a datastep, which basically works the same as a for loop, but is a little cleaner.
Data:
dat <- tibble(
current = c(100, rep(NA, 6)),
paid = 10,
pct_extra = c(.02,.05,.05,.07, .03, .01, .09)
)
Calculations:
library(libr)
dat2 <- datastep(dat,
retain = list(Outstanding = 0),
{
# Assign current value
if (n. > 1)
current <- Outstanding
# Calculate extra payment
ExtraPayment <- current * pct_extra
# Calculate outstanding balance
Outstanding <- current - paid - ExtraPayment
})
dat2
# # A tibble: 7 x 5
# current paid pct_extra Outstanding ExtraPayment
# <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 100 10 0.02 88 2
# 2 88 10 0.05 73.6 4.4
# 3 73.6 10 0.05 59.9 3.68
# 4 59.9 10 0.07 45.7 4.19
# 5 45.7 10 0.03 34.4 1.37
# 6 34.4 10 0.01 24.0 0.344
# 7 24.0 10 0.09 11.8 2.16
Upvotes: 2
Reputation: 21992
Doing this with a for
loop makes the most sense to me. First, here's the data
dat <- tibble(
current = c(100, rep(NA, 6)),
paid = 10,
pct_extra = c(.02,.05,.05,.07, .03, .01, .09),
ExtraPayment = NA,
Outstanding = NA
)
dat
# # A tibble: 7 x 5
# current paid pct_extra ExtraPayment Outstanding
# <dbl> <dbl> <dbl> <lgl> <lgl>
# 1 100 10 0.02 NA NA
# 2 NA 10 0.05 NA NA
# 3 NA 10 0.05 NA NA
# 4 NA 10 0.07 NA NA
# 5 NA 10 0.03 NA NA
# 6 NA 10 0.01 NA NA
# 7 NA 10 0.09 NA NA
Now, the loop that does the work:
for(i in 1:nrow(dat)){
dat$ExtraPayment[i] <- dat$current[i]*dat$pct_extra[i]
dat$Outstanding[i] <- dat$current[i] - dat$paid[i] - dat$ExtraPayment[i]
if(i < nrow(dat)){
dat$current[(i+1)] <- dat$Outstanding[i]
}
}
dat
# # A tibble: 7 x 5
# current paid pct_extra ExtraPayment Outstanding
# <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 100 10 0.02 2 88
# 2 88 10 0.05 4.4 73.6
# 3 73.6 10 0.05 3.68 59.9
# 4 59.9 10 0.07 4.19 45.7
# 5 45.7 10 0.03 1.37 34.4
# 6 34.4 10 0.01 0.344 24.0
# 7 24.0 10 0.09 2.16 11.8
Upvotes: 2