VeilEclipse

Reputation: 2866

Numpy Double summation

My implementation is:

import numpy as np

def getGaussianValue(x, mean, covariance):
    # Multivariate normal density N(x | mean, covariance) for a single point x
    part1 = 1/np.power(2*np.pi, x.shape[0]/2.0)
    part2 = 1/np.sqrt(np.linalg.det(covariance))
    part3 = np.exp(-(0.5) * np.matrix(x-mean) * np.matrix(np.linalg.inv(covariance)) *  np.matrix(x-mean).T)
    return part1 * part2 * part3

def getLogLikelihood(K, data, pii, mean, covariance):
    sum_i = 0.0
    for i in range(data.shape[0]):
        sum_k = 0.0
        for k in range(K):
            # Use the mean and covariance of component k, not the full arrays
            sum_k += pii[k] * getGaussianValue(data[i], mean[k], covariance[k])
        sum_i += np.log(sum_k)
    return sum_i

Here N=150, K=3, X is a 150x4 numpy array, the covariance (Sigma) is a 3x4x4 numpy array, and the mean (mu) is a 3x4 numpy array. How can I make it faster?

Upvotes: 1

Views: 1346

Answers (1)

Pavel

Reputation: 7562

It's always a good idea to precompute everything you can and never calculate anything twice.

  1. Invert each covariance matrix once and store the inverses instead of calling np.linalg.inv inside the loop.
  2. Precompute the normalization terms part1 and part2 once per component rather than on every call of getGaussianValue.
  3. Compute np.matrix(x-mean) once and reuse it; it currently appears twice in part3 (I don't know whether numpy optimizes that anyway).
  4. Consider using a library routine such as scipy.stats.multivariate_normal.pdf (this is SciPy, not NumPy).
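
As a rough illustration of points 1 to 3, here is a minimal sketch that removes both Python loops. It assumes the shapes given in the question (data is N x D, pii has length K, mean is K x D, covariance is K x D x D). The name log_likelihood_vectorized and the use of np.einsum are my own choices, not part of the original code, so treat it as one possible approach rather than the definitive one.

```python
import numpy as np

def log_likelihood_vectorized(data, pii, mean, covariance):
    """Gaussian-mixture log-likelihood without Python-level loops.

    Assumed shapes: data (N, D), pii (K,), mean (K, D), covariance (K, D, D).
    """
    D = data.shape[1]
    # Point 1: invert every covariance matrix once (inv works on stacked matrices)
    inv_cov = np.linalg.inv(covariance)                       # (K, D, D)
    # Point 2: one normalization constant per component
    norm = 1.0 / np.sqrt((2 * np.pi) ** D * np.linalg.det(covariance))  # (K,)
    # Point 3: compute x - mean once for every (point, component) pair
    diff = data[:, None, :] - mean[None, :, :]                # (N, K, D)
    # Quadratic form diff^T * inv_cov * diff for each pair (i, k)
    maha = np.einsum('nkd,kde,nke->nk', diff, inv_cov, diff)  # (N, K)
    dens = norm * np.exp(-0.5 * maha)                         # (N, K)
    # Weighted sum over components, then sum of logs over data points
    return np.sum(np.log(dens.dot(pii)))
```

Calling log_likelihood_vectorized(data, pii, mean, covariance) should give the same number as getLogLikelihood once that function is called with the per-component mean[k] and covariance[k]. For very small densities, working in log space (for example with a log-sum-exp) would be more robust, but that goes beyond what was asked.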

Upvotes: 3
