Jojo
Jojo

Reputation: 1117

Probability of occurrence of random variable chosen from Normal Distribution

I was just wondering, with the code below, how would I determine the probability of occurrence associated with one of the random variables that are chosen. For example, if a particular variable is chosen for new_E11, how would I know its probability of occurrence from the Normal distribution?

function [new_E11, new_E22] = elasticmodulusrng()

new_E11 = normrnd(136e9,9.067e9,[1 1]);

new_E22 = normrnd(8.9e9,2.373e9,[1 1]);

while new_E11<=136e9-3*9.067e9 && new_E11>=136e9+3*9.067e9
        new_E11 = normrnd(136e9,9.067e9,[1 1]);
end

while new_E11<=8.9e9-3*2.373e9 && new_E11>=8.9e9+3*2.373e9
        new_E22 = normrnd(8.9e9,2.373e9,[1 1]);
end

Thanks

Upvotes: 2

Views: 633

Answers (1)

Floris
Floris

Reputation: 46375

The probability of any specific value is zero (well, really about 1 part in 2^64... that's how many different floating point numbers there are since they are represented by a 64 bit binary; and they are not equally probable so the answer is more complicated than that).

A better question is "what is the probability density" - which can be described as the limit of

P(x - dx < X < x + dx)
----------------------
        2 * dx

This is in fact the definition of your "probability density function" or PDF. In your case, you are using the normal distribution with a mean mu and standard deviation sigma, then

pdf(X) = exp(-(X-mu)*(X-mu) / (2 * sigma * sigma)) / (sqrt(2.0*pi) * sigma)

If you really wanted the answer to your original question, you would look at the smallest delta (one least significant bit in the mantissa) of the X value for the value you are interested in. That would be your delta X, and you could then compute the actual answer (warning - not only would it be tiny, it might be hard to compute this without running into significant rounding errors. Would be hard.)

Further thought:

You test for the limit of the value to be within +- 3 sigma, and keep going until it is. That means that you artifically increase the probability very slightly - since your value is always from a particular part of the range (for which the total probability is less than 1), you need to multiply the PDF by (1/p) where p is the integral of the standard normal distribution from -3 to 3, or 0.9973. So in your case the above formula would underestimate the probability by 0.27%.

suggestion a couple of style suggestions for your code. Consider the following (gives equivalent results to your code):

function [new_E11, new_E22] = elasticmodulusrng()
  % returns randomly distributed modulus 
  % keeping the value within +- 3 standard deviations
  % identify the constants up front
  mean11 = 136e9;
  stdv11 = 9.067e9;
  mean22 = 8.9e9;
  stdv11 = 2.373e9;

  % compute the values
  new_E11 = normRandLimit(mean11, stdv11, [-3 3]);
  new_E22 = normRandLimit(mean22, stdv22, [-3 3]);

function rr = normRandLimit(m, s, b)
% helper function in the same file
% returns a normally distributed random variate
% with mean m, standard deviation s
% within limits [mean + b(1)*s,  mean + b(2)*s]

  while true 
    rr = randn(1); % doesn't need statistics toolbox - scale later
    if( rr > b(1) && rr < b(2) )
      break;  % found acceptable value
    end
  end
  % now apply scaling:
  rr = rr * s + m;

Things to note about this:

  1. "magic numbers" have a sensible name, and are all at the top of the code
  2. Computation of the "random number within limits" is relegated to a separate function
  3. Using randn rather than normrnd (so no need to tie up statistics toolbox license)
  4. Compute scaling after "good number" was computed
  5. Repeated code has been eliminated
  6. It is more obvious what is going on from the structure (and the comments)

All of this adds up to "better" code - in the sense that it's easier to debug, maintain. And when you see your own code in six months time, you can still read it...

Upvotes: 4

Related Questions