Daniel Paczuski Bak

Reputation: 4080

GMModel - how do I use this to predict a label's data?

I've made a GMModel using fitgmdist. The idea is to fit two Gaussian distributions to the data and use them to predict the labels. How can I determine which of those distributions a future data point belongs to? Am I misunderstanding the purpose of a GMModel?

clear;
load C:\Users\Daniel\Downloads\data1 data;


% Mixed Gaussian
GMModel = fitgmdist(data(:, 1:4),2)

Produces

GMModel = 

Gaussian mixture distribution with 2 components in 4 dimensions
Component 1:
Mixing proportion: 0.509709
Mean:    2.3254   -2.5373    3.9288    0.4863

Component 2:
Mixing proportion: 0.490291
Mean:    2.5161   -2.6390    0.8930    0.4833

Edit:

clear;
load C:\Users\Daniel\Downloads\data1 data;



% Mixed Gaussian
GMModel = fitgmdist(data(:, 1:4),2);


P = posterior(GMModel, data(:, 1:4));
X = round(P)

blah = X(:, 1)
dah = data(:, 5)

Y = max(mean(blah == dah), mean(~blah == dah))

Upvotes: 1

Views: 945

Answers (1)

isrish

Reputation: 692

I don't understand why you round the posterior values. Here is what I would do after fitting a mixture model.

P = posterior(GMModel, data(:, 1:4));
[~,Y] = max(P,[],2);

Now Y contains the labels, that is, the index of the Gaussian component each data point belongs to in the maximum a posteriori (MAP) sense. The important thing is to align the labels before evaluating the classification error, because renumbering might happen: Gaussian component 1 in the true labels might be component 2 in the clustering produced, and so on. That may be why your accuracy varies from 51% to 95%, in addition to other subtle problems.
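To make the alignment step concrete, here is a minimal sketch for the two-component case. It assumes the true labels in data(:, 5) are coded as 0/1, which the question's code suggests but does not state. The small P and trueLabels arrays, along with the names pNew, isSwapped and labelNew, are hypothetical stand-ins so the snippet runs on its own.

```matlab
% Sketch: score MAP labels against ground truth when component numbering is arbitrary.
% Stand-in values so the snippet runs by itself; in the question's setting use
%   P = posterior(GMModel, data(:, 1:4));  trueLabels = data(:, 5);
P = [0.9 0.1; 0.2 0.8; 0.7 0.3];   % hypothetical posteriors, one row per point
trueLabels = [1; 0; 1];            % assumed 0/1 coding of the true labels

[~, Y] = max(P, [], 2);            % MAP component index per point (1 or 2)
pred = Y - 1;                      % component 1 -> label 0, component 2 -> label 1

accDirect  = mean(pred == trueLabels);        % numbering as fitted
accSwapped = mean((1 - pred) == trueLabels);  % numbering flipped
[accuracy, best] = max([accDirect, accSwapped]);
isSwapped = (best == 2);           % true if component 1 corresponds to label 1

% Label a new point with the same mapping, given its posterior row.
pNew = [0.15 0.85];                % hypothetical; in practice posterior(GMModel, xNew)
[~, kNew] = max(pNew, [], 2);
labelNew = kNew - 1;
if isSwapped
    labelNew = 1 - labelNew;
end
```

For a future data point, the same posterior call on the new row gives its component probabilities, and cluster(GMModel, X) should return the same hard assignment directly. With more than two components, trying both orderings no longer suffices, and the best one-to-one matching between components and labels has to be searched for instead.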

Upvotes: 1
