Md. Fahd Al Georgy

Reputation: 23

How to find mean with binomial random variable in R?

Let's say we run:

mean(rbinom(100, 42, 0.76)) and get the mean, and then we run mean(rbinom(1000, 42, 0.76)) and get the mean.

Now if we calculate the formula n*p, then in both cases it would be 42 * 0.76, right? Because n will be 42 in both cases? Then what is the impact of having 100 versus 1000 samples?

Please help!!

Upvotes: 2

Views: 2972

Answers (1)

Adam Sampson

Reputation: 2021

This question would be a better fit for Cross Validated.

If you take a sample from the binomial distribution, the mean of that sample will rarely be exactly 42 * 0.76. Instead, "on average" the mean of the samples will be 42 * 0.76.
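To make the distinction concrete, here is a minimal sketch comparing the theoretical mean with the two sample means from the question. The variable names `size` and `prob` and the seed are my own choices for illustration; note that the 42 is the number of trials per draw (the n in n*p), while 100 and 1000 are only how many draws you take.

```r
set.seed(1)   # arbitrary seed, only so the run is reproducible

size <- 42    # trials per binomial draw (the "n" in n * p)
prob <- 0.76  # success probability per trial

size * prob                     # theoretical mean: 31.92, same in both cases
mean(rbinom(100,  size, prob))  # sample mean from 100 draws, near 31.92
mean(rbinom(1000, size, prob))  # sample mean from 1000 draws, usually nearer
```

Rerunning without the seed gives slightly different sample means each time, while `size * prob` never changes.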

The number of samples matters because each run is only a small sample from the population. Take it to the extreme to see how this works.

Sample size n = 1 (a single draw; this n is the first argument of rbinom, not the 42 in n*p).

  1. If you draw a 42 then the mean of the sample will be 42.
  2. If you draw a 32 then the mean of the sample will be 32.
  3. If you draw a 25 then the mean of the sample will be 25.
  4. If you draw MANY such samples, the mean of the means will be approximately 31.9 (the mean of the population).

A larger sample is less likely to be dominated by outliers, so its mean is more likely to be close to the population mean of 31.9.
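The effect can be checked with a small simulation. This is a sketch under my own assumptions: the helper `sample_means`, the repetition count `n_rep`, and the seed are hypothetical names and values chosen for illustration. It repeats the experiment many times for 1, 100, and 1000 draws and compares the spread of the resulting sample means with the standard error of the mean for a binomial, sqrt(size * p * (1 - p) / draws).

```r
set.seed(1)    # arbitrary seed for reproducibility

size  <- 42    # trials per binomial draw
prob  <- 0.76  # success probability per trial
n_rep <- 2000  # how many times each experiment is repeated (assumed value)

# Hypothetical helper: repeat "take n_draws draws and average them" n_rep times
sample_means <- function(n_draws) {
  replicate(n_rep, mean(rbinom(n_draws, size, prob)))
}

for (n_draws in c(1L, 100L, 1000L)) {
  m <- sample_means(n_draws)
  theory_sd <- sqrt(size * prob * (1 - prob) / n_draws)
  cat(sprintf("draws = %4d  mean of means = %.2f  sd of means = %.3f  theory sd = %.3f\n",
              n_draws, mean(m), sd(m), theory_sd))
}
```

The mean of the means should stay near 31.92 in every row, while the spread shrinks roughly with the square root of the number of draws. That shrinking spread is the practical impact of using 1000 draws instead of 100.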

You can visualize this in R fairly easily using the following code:

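# One draw: the histogram is a single bar at whatever value came up
n_samp <- 1
hist(rbinom(n_samp, 42, 0.76), breaks = seq(0, 42), xlim = c(0, 42))

# 1000 draws: the histogram piles up around the population mean
n_samp <- 1000
hist(rbinom(n_samp, 42, 0.76), breaks = seq(0, 42), xlim = c(0, 42))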

[Figure: histogram of binomial random draws with n = 1]

[Figure: histogram of binomial random draws with n = 1000]

Upvotes: 2
