Reputation: 2866
My implementation is:
def getGaussianValue(x, mean, covariance):
    part1 = 1/np.power(2*np.pi, x.shape[0]/2)
    part2 = 1/np.sqrt(np.linalg.det(covariance))
    part3 = np.exp(-(0.5) * np.matrix(x-mean) * np.matrix(np.linalg.inv(covariance)) * np.matrix(x-mean).T)
    return part1 * part2 * part3
def getLogLikelihood(K, data, pii, mean, covariance):
    sum_i = 0.0
    for i in range(data.shape[0]):
        sum_k = 0.0
        for k in range(K):
            # use the k-th component's mean (4,) and covariance (4x4)
            sum_k += pii[k] * getGaussianValue(data[i], mean[k], covariance[k])
        sum_i += np.log(sum_k)
    return sum_i
Here N=150, K=3, X is a 150x4 numpy array, Covariance (Sigma) is a 3x4x4 numpy array and mean (mu) is a 3x4 numpy array. How to make it faster?
Upvotes: 1
Views: 1346
Reputation: 7562
It's always a good idea to precompute everything that's possible and never calculate anything twice. In your code:
- part1 and part2 could be computed only once rather than on every call of getGaussianValue
- np.matrix(x-mean) is computed twice (I don't know whether numpy optimizes it anyway)
Alternatively, have a look at scipy.stats.multivariate_normal.pdf.
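A rough sketch of what these suggestions might look like, assuming the shapes from the question (data is N x D, mean is K x D, covariance is K x D x D). The function names log_likelihood_precomputed and log_likelihood_scipy are made up for illustration and are not from the original code. The first version computes the normalisation constant, the inverse, and x-mean once per component and handles all samples at once; the second simply delegates the density to scipy.

```python
import numpy as np
from scipy.stats import multivariate_normal


def log_likelihood_precomputed(data, pii, mean, covariance):
    """Mixture log-likelihood with per-component constants computed once.

    Hypothetical helper: data is (N, D), pii is (K,), mean is (K, D),
    covariance is (K, D, D).
    """
    n_samples, dim = data.shape
    n_components = len(pii)
    # weighted[i, k] holds pii[k] * N(data[i] | mean[k], covariance[k])
    weighted = np.empty((n_samples, n_components))
    for k in range(n_components):
        # Normalisation constant and inverse: once per component, not per sample.
        norm_const = 1.0 / np.sqrt((2 * np.pi) ** dim * np.linalg.det(covariance[k]))
        inv_cov = np.linalg.inv(covariance[k])
        # Differences for all samples, computed once and reused.
        diff = data - mean[k]
        # Quadratic form diff[i] @ inv_cov @ diff[i] for every sample i.
        mahal = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        weighted[:, k] = pii[k] * norm_const * np.exp(-0.5 * mahal)
    return np.log(weighted.sum(axis=1)).sum()


def log_likelihood_scipy(data, pii, mean, covariance):
    """Same quantity, with the Gaussian density evaluated by scipy."""
    weighted = np.column_stack([
        pii[k] * multivariate_normal.pdf(data, mean=mean[k], cov=covariance[k])
        for k in range(len(pii))
    ])
    return np.log(weighted.sum(axis=1)).sum()
```

Both variants remove the inner Python loop over samples, which is probably where most of the time goes for N=150, K=3. For data where the densities can underflow, working with log-densities instead would likely be more robust, but that goes beyond the loop shown in the question.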
Upvotes: 3