Reputation: 101
Given a 2-dimensional dataset, I would like to plot an ellipse around the data. For this, I first calculated the covariance matrix and its eigenvalues and eigenvectors:
cov = np.cov(X.T)
eigenvalues, eigenvectors = np.linalg.eig(cov)
I would now like to plot an ellipse whose axes follow the two eigenvectors using matplotlib, but can't figure out how. I suppose some kind of projection (e.g. a dot product) will be necessary?
Any help is greatly appreciated!
Upvotes: 2
Views: 4978
Reputation: 128
I took an arbitrary symmetric matrix that you can easily change (it has to be positive definite for the curve to be an ellipse).
import numpy as np
import matplotlib.pyplot as plt
# I'm taking an arbitrary symmetric matrix
COV = np.array([[1, -0.7],
[-0.7, 4]])
eigenvalues, eigenvectors = np.linalg.eig(COV)
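theta = np.linspace(0, 2*np.pi, 1000)
# Scale each eigenvector (column) by 1/sqrt(eigenvalue), then map the unit circle through it
ellipse = (1/np.sqrt(eigenvalues[None,:]) * eigenvectors) @ [np.sin(theta), np.cos(theta)]
plt.plot(ellipse[0,:], ellipse[1,:])
plt.axis('equal')  # same scale on both axes so the shape is not distorted
plt.show()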
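In order to understand why these equations give the ellipse you wanted, you have to be familiar with the ellipse equation in general (matrix) form: $x^\top A x = 1$, where $A$ is a symmetric positive-definite matrix (here `COV`).
The idea is to calculate the length of each semi-axis in the ellipse's own reference frame (the eigenvector basis), which is given by $a_i = 1/\sqrt{\lambda_i}$ for each eigenvalue $\lambda_i$ of $A$,
then pass to the Cartesian coordinate system. (We plot it in parametric form.)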
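A quick numerical check can make this concrete. The sketch below assumes the matrix form meant here is $x^\top A x = 1$ and verifies that the points produced by the parametric construction satisfy it. The names `circle`, `points`, `quad` and `residual` are only for illustration, and `np.linalg.eigh` is swapped in for `eig` because the matrix is symmetric.

```python
import numpy as np

# Same arbitrary symmetric positive-definite matrix as in the answer
A = np.array([[1.0, -0.7],
              [-0.7, 4.0]])

# eigh is the symmetric-matrix variant of eig; its eigenvectors are orthonormal
eigenvalues, eigenvectors = np.linalg.eigh(A)

theta = np.linspace(0, 2 * np.pi, 200)
circle = np.vstack([np.cos(theta), np.sin(theta)])  # unit circle, shape (2, 200)

# Divide each eigenvector (column) by sqrt(eigenvalue), then map the circle through it
points = (eigenvectors / np.sqrt(eigenvalues)) @ circle

# Evaluate the quadratic form x^T A x for every point
quad = np.einsum('in,ij,jn->n', points, A, points)
residual = np.max(np.abs(quad - 1.0))
print(residual)  # should be at floating-point noise level
```

If the residual is tiny, every plotted point lies on the curve $x^\top A x = 1$, which is what the scaling by `1/np.sqrt(eigenvalues)` is meant to achieve.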
Upvotes: 1
Reputation: 14654
Based on your example, I will just create some data here:
import numpy as np
X = np.random.randn(100, 2)
X[:,1] += 0.3 * X[:,0]
cov = np.cov(X.T)
eigenvalues, eigenvectors = np.linalg.eig(cov)
The eigenvalues are the variances of the data along the directions given by the eigenvectors. So the isolines of the distribution are ellipses whose axis lengths are proportional to the square roots of the eigenvalues (the standard deviations).
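One way to convince yourself of this is to project the centered data onto the eigenvectors and compare the variance of each projected coordinate with the eigenvalues. This is only a sketch: the seed, the sample size and the names `projected` and `projected_var` are my own choices, and `eigh` is used in place of `eig` since a covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
X = rng.standard_normal((500, 2))
X[:, 1] += 0.3 * X[:, 0]

cov = np.cov(X.T)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Coordinates of the centered data in the eigenvector basis
projected = (X - X.mean(axis=0)) @ eigenvectors

# ddof=1 matches the normalisation that np.cov uses by default
projected_var = projected.var(axis=0, ddof=1)

print(eigenvalues)
print(projected_var)  # should match the eigenvalues
```

The standard deviations along the two principal directions are then `np.sqrt(eigenvalues)`, which is where the square root in the axis lengths comes from.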
To plot the ellipse you can use the parametric equation:
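import matplotlib.pyplot as plt
theta = np.linspace(0, 2*np.pi, 1000)
# Scale each eigenvector (column) by its standard deviation, map the unit circle, shift to the data mean
ellipse = (np.sqrt(eigenvalues[None,:]) * eigenvectors) @ [np.sin(theta), np.cos(theta)] + X.mean(axis=0)[:, None]
plt.plot(ellipse[0,:], ellipse[1,:])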
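Since the question asks about matplotlib specifically, an alternative to the parametric curve is `matplotlib.patches.Ellipse`, which takes a center, a full width and height, and a rotation angle in degrees. The following is a minimal sketch, not the only way to do it: the helper `covariance_ellipse` and its `n_std` parameter (how many standard deviations the ellipse should span) are hypothetical names introduced here.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse


def covariance_ellipse(X, n_std=2.0, **kwargs):
    """Build an Ellipse patch spanning n_std standard deviations of X (hypothetical helper)."""
    mean = X.mean(axis=0)
    cov = np.cov(X.T)
    # eigh returns eigenvalues in ascending order for a symmetric matrix
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    major = eigenvectors[:, -1]  # direction of largest variance
    angle = np.degrees(np.arctan2(major[1], major[0]))
    width = 2 * n_std * np.sqrt(eigenvalues[-1])   # full length along the major axis
    height = 2 * n_std * np.sqrt(eigenvalues[0])   # full length along the minor axis
    return Ellipse(xy=mean, width=width, height=height, angle=angle, **kwargs)


rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
X[:, 1] += 0.3 * X[:, 0]

fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1], s=8)
patch = covariance_ellipse(X, n_std=2.0, fill=False, edgecolor="red")
ax.add_patch(patch)
ax.set_aspect("equal")  # keep the ellipse undistorted
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. Changing `n_std` scales the ellipse without changing its orientation.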
Upvotes: 2