Reputation: 10604
I'm trying to visualize health insurance benefit options for my company to help others make a decision. I have a table like so:
| plan | ded | oop | exp_oop |
|------+------------+------+---------|
| a | 400 | 2100 | 17400 |
| b | 1300 | 2600 | 14300 |
| c | 2600 | 5200 | 28600 |
ded = deductible; expense level where 90% co-insurance kicks in
oop = out-of-pocket maximum
exp_oop = amount of medical expense at which oop is reached
I want to plot cost to the employee vs. medical expenses incurred. Health insurance works in ranges...
cost = expenses for 0 < expenses < ded
cost = deductible + (0.10 x (expenses - ded)) for ded <= expenses < exp_oop
cost = oop for oop <= expenses <= infinity
How might I plot each of these ranges? Basically, one gets a line of slope = 1 for 0 to each plan's deductible, then a line of slope = 0.1 from x = deductible to x = oop, and then a line of slope = 0 from oop upward.
I'm not sure how to conditionally plot with ggplot2. If you'd like to use the above, here's reproducible code for these cutoffs:
dat <- data.frame(plan = c("a", "b", "c"), ded = c(400, 1300, 2600), oop = c(2100, 2600, 5200), exp_oop = c(17400, 14300, 28600))
Do I have to create the x/y values myself? In other words an intermediate table like so?
| plan | x | y |
|------+-------+------|
| 1 | 0 | 0 |
| 1 | 400 | 400 |
| 1 | 17400 | 2100 |
| 2 | 0 | 0 |
| 2 | 1300 | 1300 |
| 2 | 14300 | 2600 |
| 3 | 0 | 0 |
| 3 | 2600 | 2600 |
| 3 | 28600 | 5200 |
I'm doing this for several variants (employee only, employee + spouse, etc.) so it would be great if I didn't need separate data tables for each plan but could just work with the already defined deductibles and out of pocket max values I already have in a data frame...
Thanks for any suggestions!
Upvotes: 1
Views: 235
Reputation: 58845
My approach basically follows Drew's, but just does the steps differently. I start with a function which takes the plan, ded, oop, and exp_oop and returns a function which gives a cost for a given expense (based on those parameters). [Note: I've assumed the break between the second and third tier is exp_oop, not oop as originally stated in the question.]
cost_generator <- function(ded, oop, exp_oop, ...) {
  function(expenses) {
    ifelse(expenses < ded,
           expenses,
           ifelse(expenses < exp_oop,
                  ded + (0.1 * (expenses - ded)),
                  oop))
  }
}
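As a quick sanity check, the generated function can be called directly on a vector of expenses. This is only a sketch: it repeats cost_generator so it runs on its own, and cost_a and check are names made up for the illustration. The figures are plan a's from the question.

```r
# Same closure-returning function as above, repeated so the snippet runs on its own
cost_generator <- function(ded, oop, exp_oop, ...) {
  function(expenses) {
    ifelse(expenses < ded,
           expenses,
           ifelse(expenses < exp_oop,
                  ded + (0.1 * (expenses - ded)),
                  oop))
  }
}

# Plan a from the question: ded = 400, oop = 2100, exp_oop = 17400
cost_a <- cost_generator(ded = 400, oop = 2100, exp_oop = 17400)

# One expense below the deductible, one in the co-insurance range,
# and two at or beyond exp_oop
check <- cost_a(c(200, 10000, 17400, 50000))
print(check)
```

The four values should come out as 200, 1360 (that is, 400 + 0.1 * 9600), 2100 and 2100, so the line is continuous at both break points.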
Now using plyr, I can create a list of functions which map expenses to cost, one for each plan:
library("plyr")
funs <- mlply(dat, cost_generator)
For each function, determine the cost for a given range of expenses. Here, I've picked a range from 0 to $50,000 in increments of $100.
pts <- ldply(funs, function(f) {
  expenses <- seq(0, 50000, 100)
  data.frame(expenses=expenses, cost=f(expenses))
})
This gives a data frame in long form which is easy to plot.
library("ggplot2")
ggplot(pts, aes(expenses, cost, colour=plan)) +
geom_line()
Of course, this is not really cost, but amount paid out of pocket for a given level of expense. Total cost will include additional things (premiums, at least).
EDIT:
If you want to make sure every change point is included (not relying on rounding to the nearest $100), you can extract the points from dat and use those:
library("reshape2")
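# Every value in dat (ded, oop, exp_oop for each plan) becomes an x position
exps <- melt(dat, id.vars="plan")$value
# Add 0 and a point past the largest value so the ends of the lines are drawn
exps <- c(0, exps, 1.1*max(exps))
pts <- ldply(funs, function(f) {
  data.frame(expenses=exps, cost=f(exps))
})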
I added 0 and something larger than the largest value in the table to make the ends reasonable.
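Since each plan's line is piecewise linear, the corner points alone are enough for geom_line, which is essentially the intermediate table sketched in the question. Below is a rough base R version that builds that table straight from dat, without plyr. The names x_max and corners are made up for the example, and the right-hand edge (1.1 times the largest exp_oop) is an arbitrary choice. It assumes, as above, that the flat tier starts at exp_oop.

```r
dat <- data.frame(plan = c("a", "b", "c"),
                  ded = c(400, 1300, 2600),
                  oop = c(2100, 2600, 5200),
                  exp_oop = c(17400, 14300, 28600))

# Hypothetical right-hand edge of the plot: a bit past the largest exp_oop
x_max <- 1.1 * max(dat$exp_oop)

# Four corners per plan: origin, deductible, exp_oop, right-hand edge.
# rbind() stacks the corners as rows (one column per plan), and
# as.vector() then reads them out plan by plan.
corners <- data.frame(
  plan     = rep(dat$plan, each = 4),
  expenses = as.vector(rbind(0, dat$ded, dat$exp_oop, x_max)),
  cost     = as.vector(rbind(0, dat$ded, dat$oop, dat$oop))
)
print(corners)
```

Passing corners to ggplot(corners, aes(expenses, cost, colour=plan)) + geom_line() should then draw the same three-segment lines with only four rows per plan.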
Upvotes: 1
Reputation: 16617
Write a vectorized function to calculate costs to the employee as a function of expenses incurred. It must be vectorized, so that you can feed it to ddply.
costFinder <- function(df, oopActual) {
  # df is your 'dat'; we will throw away exp_oop
  # oopActual should be a vector; it is the x axis of your plot
  ded <- df$ded
  oopMax <- df$oop
  cost <- rep(NA, length(oopActual))  # preallocating with NAs will help ID mistakes
  cost[oopActual < ded] <- oopActual[oopActual < ded]
  cost[ded <= oopActual & oopActual < oopMax] <- 0.1 * (oopActual[ded <= oopActual & oopActual < oopMax] - ded) + ded
  cost[oopMax <= oopActual] <- oopMax
  return(cost)
}
Then define an expense sequence (not too many data points, or it becomes computationally expensive) and calculate the actual out-of-pocket cost for each value of expense, for each plan:
library("plyr")  # provides ddply() and .()
expense <- seq(0, 50000, by=200)
allCosts <- ddply(dat, .(plan), costFinder, expense)
names(allCosts)[2:ncol(allCosts)] <- expense
Now melt the data frame so you can use it with ggplot. Here, I employ the shady trick of renaming the columns of the allCosts data frame with numerical values. This is probably a bad idea, and I'd love to see a better way to do it.
library("reshape2")  # provides melt()
costsM <- melt(allCosts, id.vars="plan")
names(costsM)[2:3] <- c("expense", "actualOOP")
#melt() interprets the column names as a factor. We have to turn them back into numeric,
# by turning them into characters first and then numerics.
costsM$expense <- as.character(costsM$expense)
costsM$expense <- as.numeric(costsM$expense)
#Plot the data
library("ggplot2")
p <- ggplot() + geom_line(data=costsM, aes(x=expense, y=actualOOP, colour=plan))
print(p)
#Add vertical lines for the expected OOP, if you like - arguably it makes things more confusing.
p + geom_vline(data=dat, aes(xintercept=exp_oop, colour=plan))
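One possible way around the column-renaming trick is to build the long-form data frame directly, one plan at a time, so that expense is numeric from the start. This is only a sketch in base R: costFinder is repeated in condensed form (same thresholds as above) so the snippet runs on its own, and longCosts is a name made up for the example.

```r
dat <- data.frame(plan = c("a", "b", "c"),
                  ded = c(400, 1300, 2600),
                  oop = c(2100, 2600, 5200),
                  exp_oop = c(17400, 14300, 28600))

# Condensed copy of costFinder from above (same tiers and thresholds)
costFinder <- function(df, oopActual) {
  ded <- df$ded
  oopMax <- df$oop
  cost <- rep(NA, length(oopActual))
  low <- oopActual < ded
  mid <- ded <= oopActual & oopActual < oopMax
  cost[low] <- oopActual[low]
  cost[mid] <- 0.1 * (oopActual[mid] - ded) + ded
  cost[oopMax <= oopActual] <- oopMax
  cost
}

expense <- seq(0, 50000, by = 200)

# One long-form block per row of dat, stacked with rbind();
# no melt() and no numeric column names needed
longCosts <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) {
  data.frame(plan = dat$plan[i],
             expense = expense,
             actualOOP = costFinder(dat[i, ], expense))
}))
str(longCosts)
```

longCosts has the same three columns as costsM, so it can be dropped into the geom_line call above in its place.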
Upvotes: 1