Reputation: 2989
Is there a way in Python to obtain the covariance matrix, given the mean and the sample data points?
Example:
mean = [3 3.6]
data = [[1 2]
[2 3]
[3 3]
[4 5]
[5 5]]
I know how to calculate it by substituting these values into the formula, but is there a built-in function in Python that does this for me? I know there is one in Matlab, but I am not sure about Python.
Upvotes: 12
Views: 10164
Reputation: 500317
numpy.cov() can be used to compute the covariance matrix:
In [1]: import numpy as np
In [2]: data = np.array([[1,2], [2,3], [3,3], [4,5], [5,5]])
In [3]: np.cov(data.T)
Out[3]:
array([[ 2.5,  2. ],
       [ 2. ,  1.8]])
By default, np.cov() expects each row to represent a variable, with observations in the columns. I therefore had to transpose your matrix (by using .T).
An alternative way to achieve the same thing is by setting rowvar to False:
In [15]: np.cov(data, rowvar=False)
Out[15]:
array([[ 2.5,  2. ],
       [ 2. ,  1.8]])
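Since the question mentions already having the mean, here is a minimal sketch of the formula-based calculation, checked against np.cov(). It assumes the unbiased sample covariance (dividing by n - 1), which is what np.cov() uses by default; the names centered and cov_manual are only illustrative.

```python
import numpy as np

# Sample data from the question: one observation per row, one variable per column.
data = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)
mean = np.array([3.0, 3.6])  # column means given in the question

# Center every observation on the known mean.
centered = data - mean

# Unbiased sample covariance: divide by n - 1 (assumed here to match np.cov's default).
n = data.shape[0]
cov_manual = centered.T @ centered / (n - 1)

print(cov_manual)
# Compare with the built-in result.
print(np.allclose(cov_manual, np.cov(data, rowvar=False)))
```

Note that np.cov() computes the mean itself, so the precomputed mean is not needed when calling it.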
Upvotes: 23