Alexandru

Reputation: 25820

probability and relative frequency

If I use relative frequency to estimate the probability of an event, how good is my estimate as a function of the number of experiments? Is standard deviation a good measure? A paper/link/online book would be perfect.

http://en.wikipedia.org/wiki/Frequentist

Upvotes: 1

Views: 903

Answers (3)

stephan

Reputation: 10265

You count the number of successes s in a sequence of n Yes / No experiments, right? As long as the single experiments are independent you are in the realm of the Binomial distribution (Wikipedia). The frequency of success f = s / n is an estimator of the success probability p. The variance of your frequency estimate f is p * (1-p) / n for n draws, so its standard deviation is sqrt(p * (1-p) / n).
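
To make the variance formula concrete, here is a minimal simulation sketch. The function name, the seed, and the values p = 0.3 and n = 50 are assumptions picked for illustration only. It repeats the n-draw experiment many times and compares the spread of the observed frequencies with p * (1-p) / n.

```python
import random
import statistics


def simulate_frequency_variance(p, n, repetitions, seed=0):
    """Empirical variance of f = s / n over many repeated n-draw experiments."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    frequencies = []
    for _ in range(repetitions):
        # Count successes in n independent Yes / No draws.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        frequencies.append(successes / n)
    return statistics.pvariance(frequencies)


p, n = 0.3, 50  # illustrative values, not from the question
empirical = simulate_frequency_variance(p, n, repetitions=5000)
theoretical = p * (1 - p) / n
print(f"empirical variance:   {empirical:.5f}")
print(f"theoretical variance: {theoretical:.5f}")
```

The two printed numbers should be close, and they get closer as the number of repetitions grows.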

As long as p is not too close to zero or 1, and as long as you do not have "too small" a number of observations n, the standard deviation will be a reasonable measure for the quality of your estimate f.

If n is large enough (rule of thumb: n * p > 10, and likewise n * (1-p) > 10), you can approximate the distribution of the estimate by a normal distribution N(f, f * (1-f) / n), and the estimated standard deviation sqrt(f * (1-f) / n) is a good measure. See here for a more extensive discussion.
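
As a rough sketch of how the normal approximation is used in practice, the snippet below computes f, its estimated standard deviation, and an approximate 95% interval. The function name and the example counts (30 successes in 100 trials) are made up for illustration, and z = 1.96 is the usual normal quantile for 95% coverage.

```python
import math


def proportion_interval(successes, trials, z=1.96):
    """Return f, its estimated standard deviation, and f +/- z * sd.

    Uses the normal approximation N(f, f * (1 - f) / n), so it is only
    sensible when n is large and f is not too close to 0 or 1.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    f = successes / trials
    sd = math.sqrt(f * (1 - f) / trials)
    # Clip to [0, 1], since a probability cannot lie outside that range.
    low = max(0.0, f - z * sd)
    high = min(1.0, f + z * sd)
    return f, sd, (low, high)


f, sd, (low, high) = proportion_interval(30, 100)
print(f"f = {f:.3f}, sd = {sd:.4f}, approx. 95% interval = [{low:.3f}, {high:.3f}]")
```

For small n, or for f near 0 or 1, this interval is known to behave poorly, which is the caveat made above.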

That said, the approximation with the standard deviation will not cut any ice if this needs to have some academic rigour (e.g. if it is homework).

Upvotes: 0

Mark Lavin

Reputation: 25164

I believe you are looking for the confidence interval for a sample proportion. Here are some resources that might be helpful:

Confidence Interval for Proportion Tutorial
Confidence Interval for Proportion Handout

Basically the error of your estimate shrinks in inverse proportion to the square root of the number of samples. So if you want to cut your error in half you are going to need four times as many samples.
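
A quick way to see the square-root rule is to tabulate the standard error sqrt(p * (1-p) / n) for a few sample sizes. This is only a sketch: the helper name and the value p = 0.3 are assumptions chosen for illustration.

```python
import math


def standard_error(p, n):
    """Standard deviation of the relative frequency after n independent draws."""
    return math.sqrt(p * (1 - p) / n)


p = 0.3  # illustrative success probability
for n in (100, 400, 1600):
    # Each step uses four times as many samples as the previous one.
    print(f"n = {n:5d}  standard error = {standard_error(p, n):.4f}")
```

Each row uses four times the samples of the previous row, and the printed standard error halves each time.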

Upvotes: 4

tom10

Reputation: 69242

Probably a chi-squared test is what you want. See, for example, the Wikipedia page on Pearson's chi-squared test. Standard deviation isn't what you want, since that's about the shape of the distribution, not how accurate your estimate is of the actual distribution. Also, note that most of these things are about "normal" distributions, and not all distributions are normal.

Upvotes: 0
