Reputation: 124
import numpy as np
np.random.seed(12)
num_observations = 5
x1 = np.random.multivariate_normal([1, 1], [[1, .75],[.75, 1]], num_observations)
total = 0  # avoid shadowing the built-in sum()
for i in x1:
    total += i
print(total/num_observations)
In this snippet the output is [ 0.95766788 0.79287083], but shouldn't it be [1, 1], since I passed [1, 1] as the mean when generating the multivariate distribution?
Upvotes: 1
Views: 350
Reputation: 40878
What multivariate_normal does is:
Draw random samples from a multivariate normal distribution.
The key word here is draw. You are taking a small sample (5 observations), and a sample is not guaranteed to have the same mean as the distribution it was drawn from. The [1, 1] you passed in is the mathematical expectation, nothing more.
x1.mean(axis=0)
# array([ 0.958, 0.793])
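To put a rough number on how far off a sample mean can be: assuming the n draws are independent with covariance matrix Σ, the sample mean has covariance Σ/n. The diagonal of the covariance used in the question is 1, so with n = 5 each component of the mean has a standard deviation of about 0.447:

```latex
\operatorname{Cov}(\bar{x}) = \frac{\Sigma}{n},
\qquad
\operatorname{sd}(\bar{x}_j) = \sqrt{\frac{\Sigma_{jj}}{n}} = \frac{1}{\sqrt{5}} \approx 0.447
```

The observed deviations from 1 (about 0.04 and 0.21) are both well inside one such standard deviation, so this result is entirely ordinary.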
You can test this by drawing a much larger sample: by the law of large numbers, the sample means should land much closer to 1.
x2 = np.random.multivariate_normal([1, 1], [[1, .75],[.75, 1]], 10000)
x2.mean(axis=0)
# array([ 1.001, 1.009])
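The same idea can be sketched across several sample sizes. This is only an illustration: the sample sizes, the seed, the use of np.random.default_rng, and the helper name sample_mean_error are my own choices and are not part of the question. It prints the largest deviation of the sample mean from [1, 1] next to the theoretical standard error 1/sqrt(n).

```python
import numpy as np

rng = np.random.default_rng(12)  # seed chosen arbitrarily, for reproducibility only
mean = [1, 1]
cov = [[1, .75], [.75, 1]]

def sample_mean_error(n):
    """Largest absolute deviation of the sample mean from the true mean (hypothetical helper)."""
    draws = rng.multivariate_normal(mean, cov, n)
    return np.abs(draws.mean(axis=0) - mean).max()

errors = {}
for n in (5, 500, 50_000, 1_000_000):
    errors[n] = sample_mean_error(n)
    # 1/sqrt(n) is the standard error of each component, since the variances are 1
    print(f"n={n:>9}: error={errors[n]:.4f}, std. error={1 / np.sqrt(n):.4f}")
```

The error will not necessarily shrink at every step, but it should track the 1/sqrt(n) column in order of magnitude.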
In other words: say you had a population of 300 million people whose average age was 50. If you randomly picked 5 of them, the expected value of their mean age would be 50, but the actual mean probably wouldn't be exactly 50, and might even be quite far from it.
Upvotes: 2