Reputation: 11
The question I'm trying to answer is as follows: "A box contains 15 red marbles and 5 blue marbles. 10 marbles are drawn without replacement. What is the chance at least 2 of them will be red?" What functions in R would help me solve this? I don't need an answer to the question itself, rather a function in R that will help me solve it!
Upvotes: 0
Views: 315
Reputation: 3923
You don't have to simuate you can calculate exactly with a few r
functions.
library(DescTools)
# build a vector with the right marbles
marbles <- c(rep("Red", 15), rep("Blue", 5))
# Our denominator is the total possible ways we
# can draw ten marbles no replacement orde doesn't matter
# CombN from DescTools gives us that number
CombN(marbles, 10, repl = FALSE, ord = FALSE)
#> [1] 184756
# CombSet generates all the permutations
tail(CombSet(marbles, 10, repl = FALSE, ord = FALSE))
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [184751,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184752,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184753,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184754,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184755,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184756,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
# We'll put all those combinations in a matrix called x
x <- CombSet(marbles, 10)
# We can search for all rows that have "Red" at least twice
# with x[rowSums(x == "Red") >= 2, ]
tail(x[rowSums(x == "Red") >= 2, ])
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [184751,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184752,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184753,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184754,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184755,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
#> [184756,] "Red" "Red" "Red" "Red" "Red" "Blue" "Blue" "Blue" "Blue" "Blue"
# So the total possibilities are
nrow(x)
#> [1] 184756
# And the number that meet the condition is
nrow(x[rowSums(x == "Red") >= 2, ])
#> [1] 184756
# to calulate the probability
# we put nrow(x[rowSums(x == "Red") >= 2, ]) over
# nrow(x)
nrow(x[rowSums(x == "Red") >= 2, ]) / nrow(x)
#> [1] 1
# The probability is 1
### a more reasonable game
# same marbles
marbles <- c(rep("Red", 15), rep("Blue", 5))
# Draw 5 without replacement what is the probability
# we'll draw exactly two blue
totalpossiblepicks <- CombN(marbles, 5, repl = FALSE, ord = FALSE)
matrixofpicks <- CombSet(marbles, 5)
desiredoutcomes <- nrow(matrixofpicks[rowSums(matrixofpicks == "Blue") == 2, ])
# calculate probability
desiredoutcomes / totalpossiblepicks
#> [1] 0.2934727
Upvotes: 1
Reputation: 39154
I have a code to run simulation to calculate the probability. However, as I mentioned in the comment. The example you gave will always lead to 1, which is not exciting.
# Generate a vector containing 15 red marbles and 5 blue marbles
box <- c(rep("red", 15), rep("blue", 5))
# set seed for reproducibility
set.seed(124)
# Size of sampling
N <- 10
# A function to run simulation, taking N marbels without replacement
sim_fun <- function(x) sample(box, size = N)
# Number of simulation
S <- 10000
# Run the simulation S times
sims <- replicate(S, expr = sim_fun(), simplify = FALSE)
# Calculate the proportion that at least two marbles are red
mean(sapply(sims, function(x){
sum(x == "red") >= 2
}))
# [1] 1
Upvotes: 1