Lynn

Reputation: 1

Create Simulation in R

I have the following problem:

A casualty insurance company has 1000 policyholders, each of whom will independently present a claim in the next month with probability 5%. Assume that the amounts of the claims made are independent exponential random variables with mean 800 dollars.

Does anyone know how to write a simulation in R to estimate the probability that the sum of these claims exceeds 50,000 dollars?

Upvotes: 0

Views: 518

Answers (2)

Andrew Le

Reputation: 1

I'm a bit inexperienced with R, but here is my solution.

First, construct a function that simulates a single trial. To do so, you need to determine how many claims are filed, n. Since each of the 1000 policyholders claims independently with probability 0.05, n ~ Binomial(1000, 0.05). Note that you cannot simply assume n = 1000 * 0.05 = 50: doing so removes the randomness in the number of claims, which decreases the variance of the total and results in a lower estimated probability. I can explain why this is the case if needed. Then, generate and sum n values from an exponential distribution with mean 800.

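simulate_total_claims <- function() {
  # Number of claims this month: each of 1000 policyholders claims with probability 0.05
  n_claims <- rbinom(n = 1, size = 1000, prob = 0.05)
  # Claim amounts are exponential with mean 800, i.e. rate 1/800
  claim_amounts <- rexp(n_claims, rate = 1/800)
  sum(claim_amounts)
}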

Now, all that needs to be done is to run the above function many times and determine the proportion of runs with totals greater than 50000.

library(purrr)  # provides rerun(); deprecated since purrr 1.0.0, base replicate() also works here

totals <- rerun(.n = 100000, simulate_total_claims())
estimated_prob <- mean(unlist(totals) > 50000)
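As a cross-check that does not depend on purrr, the same simulation can be sketched in base R. The names n_sims, counts, totals_random and totals_fixed are made up for illustration, and the seed is arbitrary. The sketch also reports a rough Monte Carlo standard error for the estimate, and compares the spread of the totals against a run where the claim count is pinned at 50, which illustrates the variance point made earlier.

```r
set.seed(1)       # arbitrary seed, only for reproducibility
n_sims <- 20000   # hypothetical number of simulated months

# Draw the number of claims for each simulated month.
counts <- rbinom(n_sims, size = 1000, prob = 0.05)

# For each month, sum that many exponential claim amounts (mean 800).
totals_random <- vapply(counts,
                        function(k) sum(rexp(k, rate = 1 / 800)),
                        numeric(1))

# Same simulation with the count fixed at its expected value, for comparison.
totals_fixed <- replicate(n_sims, sum(rexp(50, rate = 1 / 800)))

# Estimated probability and its approximate Monte Carlo standard error.
p_hat  <- mean(totals_random > 50000)
se_hat <- sqrt(p_hat * (1 - p_hat) / n_sims)
c(estimate = p_hat, std_error = se_hat)

# The random-count totals should be noticeably more spread out.
c(sd_random = sd(totals_random), sd_fixed = sd(totals_fixed))
```

If the standard error looks too large for your purposes, increasing n_sims is the simplest remedy.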

Upvotes: 0

Rory S

Reputation: 1298

This sounds like a homework assignment, so it's probably best to consult with your teacher(s) if you're unsure about how to approach this. Bearing that in mind, here's how I'd go about simulating this:

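First, create a function that generates values from an exponential distribution and sums them, based on the values given in your problem description. Note that this function fixes the number of claims at its expected value, n_policies*prob_claim (50 here), rather than drawing it at random.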

get_sum_claims <- function(n_policies, prob_claim, mean_claim) {
  sum(rexp(n = n_policies*prob_claim, rate = 1/mean_claim))
} 

Next, call this function many times and store the results. The line with map_dbl does this, returning 100000 simulated sums of claims from the get_sum_claims function.

library(tidyverse)

claim_sums <- map_dbl(1:100000, ~ get_sum_claims(1000, 0.05, 800))

Finally, we can calculate the probability that the sum of claims is greater than 50000 by using the code below:

sum(claim_sums > 50000)/length(claim_sums)

With the number of claims fixed at 50, this gives an estimate of about 0.046 for the probability that the sum of claims exceeds 50000 in a given month.
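Because a sum of n independent exponentials with a common mean is gamma distributed, a simulated estimate can be checked against a closed-form value. Below is a rough sketch using base R's pgamma and dbinom; the variable names are invented for illustration. The first value corresponds to a fixed count of 50 claims, as in get_sum_claims; the second weights each possible count by its Binomial(1000, 0.05) probability.

```r
threshold  <- 50000
mean_claim <- 800

# Fixed count: the total of 50 claims is Gamma(shape = 50, rate = 1/800).
p_fixed <- pgamma(threshold, shape = 50, rate = 1 / mean_claim,
                  lower.tail = FALSE)

# Random count: average the gamma tail over N ~ Binomial(1000, 0.05).
# n = 0 is skipped because a month with no claims cannot exceed the threshold.
n <- 1:1000
p_random <- sum(dbinom(n, size = 1000, prob = 0.05) *
                  pgamma(threshold, shape = n, rate = 1 / mean_claim,
                         lower.tail = FALSE))

c(fixed = p_fixed, random = p_random)
```

If the problem intends each policyholder to claim independently, the second number is probably the one to compare a simulation against.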

Upvotes: 1
