Reputation: 129
I'm considering a potential policy change that will allow organizations (OrgID
) to spend their existing grant money if they meet certain conditions. The data is three columns, and here's the first six rows:
OrgID Amount Group
1 1 5782457 Group1
2 2 2280221 Group3
3 3 3260741 Group4
4 4 3869420 Group1
5 5 3950578 Group1
6 6 2058883 Group2
This would occur once per year, and the conditions are based on which group the organization is in (Group
), and their current balance (Amount
). Specifically,
For
Group
=Group1
- IfAmount
is $500,000 or less, all the money can be spent
- IfAmount
is greater than $500,000, 50% of the money can be spent.For
Group
=Group2
- ifAmount
is $300,000 or less, all the money can be spent
- IfAmount
is greater than $300,000, 30% of the money can be spent.For
Group
=Group3
- IfAmount
is $100,000 or less, all the money can be spent
- IfAmount
is greater than $100,000, 10% of the money can be spent.For
Group
=Group4
- No money can be spent under any conditions.
I wanted to know the total money remaining after each year for the next five years, so I turned to the dplyr package and wrote the following:
mydata <-
mydata %>%
mutate(ReleaseOne =
ifelse(Group == "Group1",
ifelse(Amount <= 500000, Amount,
round(Amount*0.50, 2)),
ifelse(Group == "Group2",
ifelse(Amount <= 300000, Amount,
round(Amount*0.30, 2)),
ifelse(Group == "Group3",
ifelse(Amount <= 100000, Amount,
round(Amount*0.10, 2)), 0)))) %>%
mutate(RemainOne =
Amount - ReleaseOne)
...
mydata <-
mydata %>%
mutate(ReleaseFive =
ifelse(Group == "Group1",
ifelse(RemainFour <= 500000, RemainFour,
round((RemainFour)*0.50, 2)),
ifelse(Group == "Group2",
ifelse(RemainFour <= 300000, RemainFour,
round((RemainFour)*0.30, 2)),
ifelse(Group == "Group3",
ifelse(RemainFour <= 100000, RemainFour,
round((RemainFour)*0.10, 2)), 0)))) %>%
mutate(RemainFive =
RemainFour - ReleaseFive)
Hence, I just repeated the same block of code five times, but each time I changed the names of the variables that begin with "Release" and "Remain" (i.e. RemaimOne
to RemainTwo
, ReleaseOne
to ReleaseTwo
, etc.).
Doing it this way is fine but it got pretty messy. Is there a way to simplify this with custom functions, possibly including for
and while
loops, for example?
Also, it would be valuable to know how how many years it would be until all organizations in groups 1, 2, and 3 reached Amount = 0
; but, the only way I know how to do it is keep repeating what's above until the amounts reach zero.
The data is named mydata.txt and can be found on GitHub at this link.
Upvotes: 0
Views: 216
Reputation: 145775
Here's a little function that should do the trick - I made it slightly more general as well. It doesn't round, but I'm sure you can edit that in if you'd like.
extrap = function(data,
threshhold = c(5e5, 3e5, 1e5, 0),
below = c(1, 1, 1, 1),
above = c(.5, .3, .1, 0),
n = 4) {
res = list()
x = data$Amount
g = as.numeric(data$Group)
for (i in 1:n) {
x = x * above[g] ^ (x > threshhold[g]) * below[g] ^ (x <= threshhold[g])
res[[i]] = x
}
names(res) = paste0("Release_", 1:n)
return(bind_cols(data, res))
}
Running it on the head
of the data you shared in the question:
extrap(dd)
# OrgID Amount Group Release_1 Release_2 Release_3 Release_4
# 1 1 5782457 Group1 2891228.5 1445614.25 722807.12 361403.56
# 2 2 2280221 Group3 228022.1 22802.21 22802.21 22802.21
# 3 3 3260741 Group4 0.0 0.00 0.00 0.00
# 4 4 3869420 Group1 1934710.0 967355.00 483677.50 483677.50
# 5 5 3950578 Group1 1975289.0 987644.50 493822.25 493822.25
# 6 6 2058883 Group2 617664.9 185299.47 185299.47 185299.47
It relies on Group
being a factor
in the data frame, and that the threshhold
, below
, and above
input vectors correspond to the levels of the Group
factor. To make it more general I added a below
vector, which in your case always 0? I'm a little confused by Group4
, maybe the above
value should be 1
? I'll leave the details to you.
Upvotes: 0
Reputation: 611
You can create a separate data frame to compare with-
grp_data <- data.frame("Group" = c("Group1", "Group2", "Group3", "Group4"),
"threshold" = c(500000,300000,100000,0),
"percent" = c(0.5, 0.3, 0.1, 0))
mydata$allowed <- sapply(seq(nrow(mydata)), function(x)
{
ifelse(mydata[x, "Amount"] >=
grp_data[grp_data$Group == mydata[x, "Group"], "threshold"],
grp_data[grp_data$Group == mydata[x, "Group"], "percent"] * mydata[x, "Amount"],
mydata[x, "Amount"])
})
Upvotes: 1