Reputation: 1
I am trying to implement a multivariate Gaussian Mixture Model and am trying to calculate the probability density function using tensors. There are n data points, k clusters, and d dimensions. So far, I have two tensors: one is an (n, k, d) tensor of centered data points, and the other is a (k, d, d) tensor of covariance matrices. I can compute an n x k matrix of probabilities by doing
centered = np.repeat(points[:, np.newaxis, :], K, axis=1) - mu[np.newaxis, :]  # NxKxD
prob = np.zeros((n, k))
constant = 1 / np.power(2 * np.pi, d / 2)
for n in range(centered.shape[0]):
    for k in range(centered.shape[1]):
        p = centered[n, k, :][np.newaxis]  # 1xD
        power = -1/2 * (p @ np.linalg.inv(sigma[k, :, :]) @ p.T)
        prob[n, k] = constant / np.sqrt(np.linalg.det(sigma[k, :, :])) * np.exp(power)
where sigma is the triangularized (k, d, d) tensor of covariances and centered are my points. What is a more pythonic way of doing this using numpy's tensor capabilities?
Upvotes: 0
Views: 52
Reputation: 231615
Just a couple of quick observations:

- I don't see you using `p` in the loop; is this a mistake? Using `n` instead?
- The `.T` in `centered[n,k,:].T` does nothing; with that index the array is 1d.
- I'm not sure if `np.linalg.inv` can handle batches of arrays, allowing `np.linalg.inv(sigma)`.
- `@` allows batches, as long as the last 2 dimensions are the ones entering into the dot (with the usual "last axis of A, 2nd-to-last axis of B" rule); `einsum` can also be used.
- Again, does `np.linalg.det` handle batches?
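For what it's worth, `np.linalg.inv` and `np.linalg.det` do broadcast over leading dimensions in current NumPy, so the whole loop can be replaced by one `einsum` for the quadratic form. Here is a sketch under that assumption; the random `centered` and `sigma` arrays are just stand-ins for your own (n, k, d) and (k, d, d) tensors:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, d = 5, 3, 2

# Stand-in data: centered points (n, k, d) and SPD covariances (k, d, d)
centered = rng.standard_normal((n, k, d))
A = rng.standard_normal((k, d, d))
sigma = A @ A.transpose(0, 2, 1) + d * np.eye(d)  # make each matrix SPD

# Batched inverse and determinant: both accept stacks of matrices
inv = np.linalg.inv(sigma)   # (k, d, d)
det = np.linalg.det(sigma)   # (k,)

# Quadratic form (x - mu)^T Sigma^{-1} (x - mu) for every (point, cluster)
# pair at once: sum over the two d-axes, keep n and k
maha = np.einsum('nkd,kde,nke->nk', centered, inv, centered)

# Multivariate normal density, shape (n, k)
prob = np.exp(-0.5 * maha) / np.sqrt((2 * np.pi) ** d * det)
```

This computes the same values as the double loop, without any Python-level iteration over points or clusters.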
Upvotes: 0