Reputation: 4080
I've fitted a GMModel using fitgmdist. The idea is to fit two Gaussian components to the data and use them to predict the labels. How can I determine whether a future data point belongs to one of those distributions? Am I misunderstanding the purpose of a GMModel?
clear;
load C:\Users\Daniel\Downloads\data1 data;
% Mixed Gaussian
GMModel = fitgmdist(data(:, 1:4),2)
This produces:
GMModel =
Gaussian mixture distribution with 2 components in 4 dimensions
Component 1:
Mixing proportion: 0.509709
Mean: 2.3254 -2.5373 3.9288 0.4863
Component 2:
Mixing proportion: 0.490291
Mean: 2.5161 -2.6390 0.8930 0.4833
Edit:
clear;
load C:\Users\Daniel\Downloads\data1 data;
% Mixed Gaussian
GMModel = fitgmdist(data(:, 1:4),2);
P = posterior(GMModel, data(:, 1:4));
X = round(P)
blah = X(:, 1)
dah = data(:, 5)
Y = max(mean(blah == dah), mean(~blah == dah))
Upvotes: 1
Views: 945
Reputation: 692
I don't understand why you round the posterior values. Here is what I would do after fitting a mixture model.
P = posterior(GMModel, data(:, 1:4));
[~,Y] = max(P,[],2);
Now Y contains the labels, that is, the index of the Gaussian component each data point most likely belongs to in the maximum a posteriori (MAP) sense. It is important to align the labels before evaluating the classification error, because the component numbering is arbitrary: Gaussian component 1 in the true labeling might be component 2 in the clustering produced, and so on. That may be why your accuracy varies from 51% to 95%, in addition to other subtle problems.
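As a rough sketch of the alignment step for the two-component case, you could score both possible mappings from component index to label and keep the better one. The synthetic X and trueLabels below are hypothetical stand-ins for data(:, 1:4) and data(:, 5), and the 0/1 coding of the true labels is an assumption based on the comparison in the question. The last part uses cluster to assign a new point, which is one way to handle future data.

```matlab
rng(0);                                   % reproducible synthetic example
% Hypothetical stand-ins for data(:, 1:4) and data(:, 5); labels assumed 0/1.
X = [randn(200, 4); randn(200, 4) + 5];
trueLabels = [zeros(200, 1); ones(200, 1)];

GMModel = fitgmdist(X, 2);
P = posterior(GMModel, X);                % N-by-2 posterior probabilities
[~, Y] = max(P, [], 2);                   % MAP component index (1 or 2)

% Component numbering is arbitrary, so score both possible mappings.
accDirect  = mean((Y - 1) == trueLabels); % component 1 -> 0, component 2 -> 1
accSwapped = mean((2 - Y) == trueLabels); % component 1 -> 1, component 2 -> 0
accuracy = max(accDirect, accSwapped);

% Assign a new, unseen point to its most probable component,
% then translate the component index with the better mapping.
newPoint = [5 5 5 5];                     % hypothetical future observation
idx = cluster(GMModel, newPoint);
if accSwapped > accDirect
    predictedLabel = 2 - idx;
else
    predictedLabel = idx - 1;
end
```

With more than two components, trying both mappings is no longer enough; you would need to search over permutations of the component indices or solve an assignment problem on the confusion matrix.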
Upvotes: 1