Reputation: 31
I am having trouble fitting a multivariate gaussian distribution to my dataset, more specifically, finding a mean vector (or multiple mean vectors). My dataset is an N x 8 matrix and currently I am using this code:
muVector = np.mean(Xtrain, axis=0)
where Xtrain is my training data set.
For the covariance I am building it using a arbitrary variance value (.5) and doing:
covariance = np.dot(.5, np.eye(N,N)
where N is the number of observations.
But when I construct my Phi matrix, I am getting all zeros. Here is my code:
muVector = np.mean(Xtrain, axis=0)
# get covariance matrix from Xtrain
cov = np.dot(var, np.eye(N,N))
cov = np.linalg.inv(cov)
# build Xtrain Phi
Phi = np.ones((N,M))
for row in range(N):
temp = Xtrain[row,:] - muVector
temp.shape = (1,M)
temp = np.dot((-.5), temp)
temp = np.dot(temp, cov)
temp = np.dot(temp, (Xtrain[row,:] - muVector))
Phi[row,:] = np.exp(temp)
Any help is appreciated. I think I might have to use np.random.multivariate_normal()? But I do not know how to use it in this case.
Upvotes: 3
Views: 9978
Reputation: 124
By "Phi" I believe that you mean the probability density function (pdf) that you want to estimate. In this case, the covariance matrix should be MxM and the output Phi will be Nx1:
# -*- coding: utf-8 -*-
import numpy as np
N = 1024
M = 8
var = 0.5
# Creating a Xtrain NxM observation matrix.
# Its muVector is [0, 1, 2, 3, 4, 5, 6, 7] and the variance for all
# independent random variables is 0.5.
Xtrain = np.random.multivariate_normal(np.arange(8), np.eye(8,8)*var, N)
# Estimating the mean vector.
muVector = np.mean(Xtrain, axis=0)
# Creating the estimated covariance matrix and its inverse.
cov = np.eye(M,M)*var
inv_cov = np.linalg.inv(cov)
# Normalization factor from the pdf.
norm_factor = 1/np.sqrt((2*np.pi)**M * np.linalg.det(cov))
# Estimating the pdf.
Phi = np.ones((N,1))
for row in range(N):
temp = Xtrain[row,:] - muVector
temp.shape = (1,M)
temp = np.dot(-0.5*temp, inv_cov)
temp = np.dot(temp, (Xtrain[row,:] - muVector))
Phi[row] = norm_factor*np.exp(temp)
Alternatively, you can use the pdf
method from scipy.stats.multivariate_normal
:
# -*- coding: utf-8 -*-
import numpy as np
from scipy.stats import multivariate_normal
N = 1024
M = 8
var = 0.5
# Creating a Xtrain NxM observation matrix.
# Its muVector is [0, 1, 2, 3, 4, 5, 6, 7] and the variance for all
# independent random variables is 0.5.
Xtrain = np.random.multivariate_normal(np.arange(8), np.eye(8,8)*var, N)
# Estimating the mean vector.
muVector = np.mean(Xtrain, axis=0)
# Creating the estimated covariance matrix.
cov = np.eye(M,M)*var
Phi2 = multivariate_normal.pdf(Xtrain, mean=muVector, cov=cov)
Both Phi
and Phi2
output arrays will be equal.
Upvotes: 3