Reputation: 23
Let's say we run:
mean(rbinom(100, 42, 0.76)) and get the mean, and then we run mean(rbinom(1000, 42, 0.76)) and get the mean.
Now if we calculate the formula n*p, then in both cases it would be 42 * 0.76, right? Because n will be 42 in both cases? Then what is the impact of having 100 versus 1000 samples?
Please help!!
Upvotes: 2
Views: 2972
Reputation: 2021
This question would be a better fit for Cross Validated.
If you take a sample from the binomial distribution, the mean of that sample will usually not be exactly 42 * 0.76 = 31.92. Instead, on average, the sample means will equal 42 * 0.76.
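The number of samples matters because each call draws only a finite sample from the population. Take it to the extreme to see how this works.
Sample size (the first argument to rbinom) = 1: the "mean" is just a single draw, which is always an integer between 0 and 42, so it can never equal 31.92 and can easily land several units away from it.
A larger sample size is less likely to be dominated by outliers and more likely to be close to the population mean of 31.92.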
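To put rough numbers on this, here is a short worked calculation. It assumes the draws are independent, writes n = 42 and p = 0.76 for the binomial parameters, and uses N (my notation, not the question's) for the number of draws, i.e. the first argument to rbinom.

```latex
\begin{align*}
E[\bar{X}] &= np = 42 \times 0.76 = 31.92 \quad \text{(the same for every } N\text{)} \\
\operatorname{Var}(X_i) &= np(1-p) = 42 \times 0.76 \times 0.24 = 7.6608 \\
\operatorname{SD}(\bar{X}) &= \sqrt{\frac{np(1-p)}{N}} \\
N = 1:&\quad \operatorname{SD}(\bar{X}) \approx 2.77 \\
N = 100:&\quad \operatorname{SD}(\bar{X}) \approx 0.277 \\
N = 1000:&\quad \operatorname{SD}(\bar{X}) \approx 0.0875
\end{align*}
```

So the expected value of the sample mean does not depend on N, but its typical distance from 31.92 shrinks by a factor of sqrt(10), about 3.2, when going from 100 to 1000 draws.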
You can visualize this in R fairly easily using the following code:
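# One draw: the histogram is a single bar wherever that draw happened to land
n_samp <- 1
hist(rbinom(n_samp, 42, 0.76), breaks = seq(0, 42), xlim = c(0, 42))
# 1000 draws: the histogram piles up around 42 * 0.76 = 31.92
n_samp <- 1000
hist(rbinom(n_samp, 42, 0.76), breaks = seq(0, 42), xlim = c(0, 42))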
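The histograms above show the raw draws rather than the sample means. If you want to see the effect on the mean itself, one possible sketch is to repeat each experiment many times and compare how much the means vary. The seed, the 2000 repetitions, and the variable names below are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

size <- 42    # number of trials per binomial draw (the "n" in n * p)
prob <- 0.76  # success probability

# Repeat each experiment 2000 times and keep the sample mean of every run
means_100  <- replicate(2000, mean(rbinom(100,  size, prob)))
means_1000 <- replicate(2000, mean(rbinom(1000, size, prob)))

# Both sets of means are centred near size * prob = 31.92
print(c(mean(means_100), mean(means_1000)))

# The spread of the means is much smaller with 1000 draws per run
print(c(sd(means_100), sd(means_1000)))

# Theoretical standard deviation of the sample mean, for comparison
print(sqrt(size * prob * (1 - prob) / c(100, 1000)))
```

Both averages should come out close to 31.92, while the standard deviation of the means should be roughly three times smaller for the 1000-draw runs than for the 100-draw runs.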
Upvotes: 2