user135172

Reputation: 293

Is there a Matlab function for calculating std of a binomial distribution?

I have a binary vector V, in which each entry describes success (1) or failure (0) in the relevant trial out of a whole session. (the length of the vector denotes the number of trials in the session). I can easily calculate the success rate of the session (by taking the mean of the vector i.e. (sum(V)/length(V))).

However I also need to know the variance or std of each session.

In order to calculate that, is it OK to use the Matlab std function (i.e. to take std(V)/length(V))? Or, should I use something which is specifically suited for the binomial distribution? Is there a Matlab std (or variance) function which is specific for a "success/failure" distribution?

Thanks

Upvotes: 1

Views: 210

Answers (1)

SecretAgentMan

Reputation: 2854

If you satisfy the assumptions of the Binomial distribution,

  • a fixed number of n independent Bernoulli trials,
  • each with constant success probability p,

then I don't think a dedicated function is necessary, since the parameters n and p can be obtained from your data.

Note that we model the number of successes (in n trials) as a random variable following the Binomial(n,p) distribution.

n = length(V);
p = mean(V);     % equivalently, sum(V)/length(V)   
                 % the mean is the maximum likelihood estimator (MLE) for p
                 % note: this is only an estimate; it approaches the true p for large n or with replication

Then the standard deviation of the number of successes in n independent Bernoulli trials with constant success probability p is just sqrt(n*p*(1-p)).
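As a minimal sketch of how these quantities relate, the snippet below uses a made-up 20-trial vector V (hypothetical data, not from the question) and computes the standard deviation of the number of successes, of a single trial, and of the estimated success rate. The variable names sd_count, sd_trial and se_rate are illustrative only.

```matlab
% Hypothetical session of 20 trials (1 = success, 0 = failure)
V = [1 0 1 1 0 1 1 1 0 1 1 0 1 1 1 0 1 1 0 1];

n     = length(V);                   % number of trials
p_hat = mean(V);                     % estimated success probability

sd_count = sqrt(n*p_hat*(1-p_hat));  % std of the number of successes
sd_trial = sqrt(p_hat*(1-p_hat));    % std of a single Bernoulli trial
se_rate  = sqrt(p_hat*(1-p_hat)/n);  % standard error of the success rate

% std(V,1) normalises by n and equals sd_trial for binary data;
% std(V) normalises by n-1, so it is slightly larger.
fprintf('std(V,1) = %.4f, sd_trial = %.4f\n', std(V,1), sd_trial);
fprintf('sd_count = %.4f, se_rate = %.4f\n', sd_count, se_rate);
```

Under these assumptions, the uncertainty of the success rate is sd_trial/sqrt(n), which is not the same as std(V)/length(V).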

Of course, you can also estimate this from your data if you have multiple samples. Note that this is different from std(V). With your data format, it would require multiple vectors V1, V2, V3, etc. (replication); the sample standard deviation of the number of successes would then be obtained as follows.

% Given V1, V2, V3 sets of Bernoulli trials
std([sum(V1) sum(V2) sum(V3)])
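To see the replication idea at work, here is a small simulation sketch. It assumes values for n, p and the number of sessions m, and generates the sessions with rand rather than using real data. It compares the sample standard deviation of the success counts with sqrt(n*p*(1-p)).

```matlab
% Assumed values, for illustration only
n = 10;        % trials per session
p = 0.65;      % true success probability
m = 5000;      % number of simulated sessions (replications)

rng(1);                          % fix the seed so the run is repeatable
sessions = rand(n, m) < p;       % each column is one binary vector V
counts   = sum(sessions, 1);     % number of successes in each session

sd_sample = std(counts);         % sample std across sessions
sd_theory = sqrt(n*p*(1-p));     % Binomial(n,p) standard deviation
fprintf('sample: %.4f   theory: %.4f\n', sd_sample, sd_theory);
```

With many sessions the two numbers should be close. With only a handful of sessions, the sample value can differ noticeably from the theoretical one.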

If you already know the parameters n and p, you can obtain the standard deviation directly.

n = 10;
p = 0.65;
pd = makedist('Binomial','N',n,'p',p)  % parameters are passed as name-value pairs
std(pd)                                % 1.5083

or

sqrt(n*p*(1-p))                        % 1.5083

as discussed earlier.


Does the standard deviation increase with n?
The OP has asked:

Something is bothering me.. if std = sqrt(n*p*(1-p)), then it increases with n. Shouldn't the std decrease when n increases?

Confirmation & Derivation:

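Definitions:

Let $X_1, \dots, X_n$ be independent Bernoulli($p$) random variables ($X_i = 1$ for a success, $X_i = 0$ for a failure), and let $Y = \sum_{i=1}^{n} X_i$ be the number of successes.

Then we know that

$$Y \sim \text{Binomial}(n, p)$$

Then, just from the definitions of expectation and variance, we can show that the variance increases with n (and likewise the standard deviation, once you take the square root).

$$\operatorname{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = p - p^2 = p(1-p)$$

$$\operatorname{Var}(Y) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = n\,p\,(1-p)$$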

Since the square root is a non-decreasing function, we know the same relationship holds for the standard deviation.
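The intuition that the std should shrink with n applies to the success rate (the number of successes divided by n), not to the count itself. A short sketch with an assumed p and a few assumed session lengths shows both trends. sd_count and sd_rate are illustrative names.

```matlab
p = 0.65;                          % assumed success probability
n = [10 100 1000 10000];           % increasing session lengths

sd_count = sqrt(n .* p .* (1-p));  % std of the number of successes
sd_rate  = sqrt(p .* (1-p) ./ n);  % std of the success rate (count / n)

% Columns: n, std of the count, std of the rate
disp([n; sd_count; sd_rate]');
```

The count's standard deviation grows like sqrt(n), while that of the rate falls like 1/sqrt(n).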

Upvotes: 1
