The Pointer

Reputation: 2386

Calculating probabilities of simulated random variables in R

I have the following graph:

[Image not available: graph connecting the nodes A, X, Y, and B]

I need to travel from A to B. I also assume that I take the fastest route from A to B every day.

The travel times (in hours) between the nodes are exponentially distributed. I have simulated them, with the relevant lambda values, in R as follows:

AtoX <- rexp(1000, 4)
AtoY <- rexp(1000, 2.5)
XtoY <- rexp(1000, 10)
YtoX <- rexp(1000, 10) 
XtoB <- rexp(1000, 3)
YtoB <- rexp(1000, 5)

I calculated the average travel time per day in R as follows:

AXB <- AtoX + XtoB
AYB <- AtoY + YtoB
AXYB <- AtoX + XtoY + YtoB
AYXB <- AtoY + YtoX + XtoB 

TravelTimes <- pmin(AXB, AYB, AXYB, AYXB)
averageTravelTime <- mean(TravelTimes)

I'm now trying to find the following for every single day:

  1. With which probability is each of the four possible routes from A to B taken?

  2. What is the probability that I have to travel more than half an hour?

For (1), I understand that I need to take the cumulative distribution function (CDF) P(X <= x) for each route.

For (2), I understand that I need to take the cumulative distribution function (CDF) P(X >= 0.5), where 0.5 denotes half an hour.

I have only just started learning R, and I am unsure of how to go about doing this.

Reading the documentation, it seems that I might need to do something like the following to calculate the CDF:

pexp()

1 - pexp()

How can I do this?

Upvotes: 1

Views: 417

Answers (1)

Julius Vainora

Reputation: 48211

Let R1, R2, R3, R4 be, in some order, random variables corresponding to the total time of the four routes. Each is a sum of independent exponential random variables; because the rates differ, each follows a hypoexponential (generalized Erlang) distribution, which reduces to the Erlang/Gamma distribution only when the rates are equal (see here).
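
As a quick sanity check on the distribution of a single route, here is a minimal sketch for the A-X-B route only. It assumes the two legs are independent with the rates from the question (4 and 3) and uses the closed-form CDF of a sum of two independent exponentials with distinct rates. The names `phypo2`, `axb`, `empirical`, `analytic` and `n` are illustrative and do not come from the question.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 1e5     # number of simulated trips (assumed)

# CDF of the sum of two independent exponentials with distinct rates a and b
phypo2 <- function(t, a, b) {
  1 - (b * exp(-a * t) - a * exp(-b * t)) / (b - a)
}

# Simulate the A -> X -> B route with the rates from the question
axb <- rexp(n, 4) + rexp(n, 3)

# Empirical vs. closed-form probability that this route takes at most half an hour
empirical <- mean(axb <= 0.5)
analytic  <- phypo2(0.5, a = 4, b = 3)
c(empirical = empirical, analytic = analytic)
```

The two numbers should agree up to simulation noise. This only covers one route in isolation; it says nothing yet about which route is the fastest.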

To answer 1, you want to find P(min{R1, R2, R3, R4} = R_i) for i=1,2,3,4. While the minimum of independent exponential random variables is tractable (see here), as far as I know that is not the case with Erlang/Gamma distributions in general. Hence, I believe you need to answer this question numerically, using simulations.

The same applies to the second question, which requires finding P(min{R1, R2, R3, R4} >= 1/2).

Hence, we have

table(apply(cbind(AXB, AYB, AXYB, AYXB), 1, which.min)) / 1000 
#     1     2     3     4 
# 0.312 0.348 0.264 0.076 

and

mean(TravelTimes >= 0.5)
# [1] 0.145

as our estimates. Increasing the sample size from 1000 to a larger number (e.g., 1e6 still runs fast) makes those estimates more precise.
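
The larger run suggested above could look roughly like the sketch below. It repeats the simulation from the question with more draws, names the route columns so the resulting table is labelled, and attaches a binomial standard error to the second estimate. The names `n`, `routes`, `fastest`, `route_probs`, `travel_times`, `p_long` and `se_long` are illustrative, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
n <- 1e6      # number of simulated days (assumed; the question used 1000)

# Leg travel times in hours, with the rates from the question
AtoX <- rexp(n, 4);  AtoY <- rexp(n, 2.5)
XtoY <- rexp(n, 10); YtoX <- rexp(n, 10)
XtoB <- rexp(n, 3);  YtoB <- rexp(n, 5)

# One column per route, named so the results are labelled
routes <- cbind(AXB  = AtoX + XtoB,
                AYB  = AtoY + YtoB,
                AXYB = AtoX + XtoY + YtoB,
                AYXB = AtoY + YtoX + XtoB)

# Column index of the fastest route on each day; max.col() on the negated
# matrix is a vectorised alternative to apply(routes, 1, which.min)
fastest <- max.col(-routes, ties.method = "first")
route_probs <- table(factor(colnames(routes)[fastest],
                            levels = colnames(routes))) / n

# Probability of travelling at least half an hour, with its standard error
travel_times <- routes[cbind(seq_len(n), fastest)]
p_long  <- mean(travel_times >= 0.5)
se_long <- sqrt(p_long * (1 - p_long) / n)

route_probs
c(estimate = p_long, std_error = se_long)
```

The standard error shrinks like 1/sqrt(n), so going from 1000 to 1e6 draws narrows the uncertainty by a factor of about 30.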

Upvotes: 2
