Reputation: 5297
This should be a simple question: either I am missing information or I have mis-coded this.
I am trying to implement the Mahalanobis distance in Python, following the formula on Wikipedia.
My code is as follows:
import numpy as np
a = np.array([[1, 3, 5]])
b = np.array([[4, 5, 6]])
X = np.empty((0,3), float)
X = np.vstack([X, [2,3,4]])
X = np.vstack([X, a])
X = np.vstack([X, b])
n = ((a-b).T)*(np.cov(X)**-1)*(a-b)
dist = np.sqrt(n)
dist is a 3x3 array, but shouldn't I be expecting a single number representing the distance?
dist = array([[ 1.5 , 1.73205081, 1.22474487],
[ 1.73205081 , 2. , 1.41421356],
[ 1.22474487 , 1.41421356, 1. ]])
Wikipedia does not suggest (to me) that it should return a matrix. Googling for implementations of the Mahalanobis distance in Python, I have not found anything to compare it to.
Upvotes: 0
Views: 84
Reputation: 31672
From the Wikipedia page you can see that a and b are column vectors, but in your case they are 2-D arrays of shape (1, 3), i.e. row vectors, so the transposes need to be swapped. The products should also be matrix multiplications. In NumPy, * means element-wise multiplication; for a matrix product you should use the np.dot function or the .dot method of the array. For your case the answer is:
n = (a-b).dot((np.cov(X)**-1).dot((a-b).T))
dist = np.sqrt(n)
In [54]: n
Out[54]: array([[ 25.]])
In [55]: dist
Out[55]: array([[ 5.]])
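To make the difference between * and .dot concrete, here is a minimal sketch that only compares shapes. The identity matrix S is a placeholder assumption standing in for the inverse covariance; it is not the matrix from the question.

```python
import numpy as np

a = np.array([[1, 3, 5]])
b = np.array([[4, 5, 6]])
d = a - b                  # shape (1, 3): a row vector
S = np.eye(3)              # placeholder 3x3 matrix, used only to show shapes

# Element-wise product with broadcasting: (3, 1) * (3, 3) * (1, 3) -> (3, 3)
elementwise = d.T * S * d

# Matrix product: (1, 3) . (3, 3) . (3, 1) -> (1, 1)
matrix_prod = d.dot(S).dot(d.T)

print(elementwise.shape)   # (3, 3)
print(matrix_prod.shape)   # (1, 1)
```

The first expression is why the question ends up with a 3x3 array; the second collapses to a single value, as the formula requires.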
EDIT
As @roadrunner66 noticed, you should use the matrix inverse rather than the element-wise inverse (which is what **-1 computes). Usually np.linalg.inv works for that, but here the covariance matrix is singular, so it raises a singular-matrix error and you need to use the pseudo-inverse np.linalg.pinv instead:
n = (a-b).dot((np.linalg.pinv(np.cov(X))).dot((a-b).T))
dist = np.sqrt(n)
In [90]: n
Out[90]: array([[ 1.77777778]])
In [91]: dist
Out[91]: array([[ 1.33333333]])
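The same computation can be wrapped in a small helper. This is only a sketch: the function name mahalanobis is hypothetical, and it assumes 1-D inputs so that the result is a plain float. Note also that np.cov treats each row as a variable by default; if the rows of X are meant to be observations (an assumption about the intent, not something stated in the question), passing rowvar=False gives the covariance of the three coordinates instead, and the resulting distance differs.

```python
import numpy as np

def mahalanobis(u, v, cov):
    """Mahalanobis distance between vectors u and v for a given covariance matrix."""
    d = np.ravel(u) - np.ravel(v)      # flatten so (1, 3) inputs also work
    vi = np.linalg.pinv(cov)           # pseudo-inverse: tolerates a singular covariance
    return float(np.sqrt(d.dot(vi).dot(d)))

a = np.array([1, 3, 5])
b = np.array([4, 5, 6])
X = np.vstack([[2, 3, 4], a, b])

# Same covariance as in the answer above (rows treated as variables)
dist_rows = mahalanobis(a, b, np.cov(X))

# Assumption: rows of X are observations, columns are variables
dist_cols = mahalanobis(a, b, np.cov(X, rowvar=False))

print(dist_rows, dist_cols)
```

With only three observations in three dimensions the covariance is singular under either convention, so the pseudo-inverse is needed in both calls.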
Upvotes: 2