Reputation: 376
I have previously posted this on the Mathworks Community, but am reposting here for a wider audience...
I have a one-dimensional histogram to which I want to fit Gaussians:
In the above example I need to find the centres of the 4 dominant peaks; however, the number of peaks may vary from one histogram to another. Below is an MWE of my approach:
bins = 2000;
fsc_hist = histogram(FSC_data.FSC_HF,bins);hold on;
%% smooth data to get rid of discretization
fscValues = fsc_hist.Values;
binStep = (fsc_hist.BinLimits(2)-fsc_hist.BinLimits(1))/fsc_hist.NumBins;
binCenters = binStep * [0:fsc_hist.NumBins-1];
smoothValues = smooth(binCenters, fscValues, 0.1, 'rloess');
%% fit GMM
expectedPeaks = 4;
gmm = fitgmdist(smoothValues, expectedPeaks, 'RegularizationValue', 0.1);
Which returns the following GMM result:
Gaussian mixture distribution with 4 components in 1 dimensions
Component 1: Mixing proportion: 0.294734 Mean: 0.2417
Component 2: Mixing proportion: 0.152275 Mean: 41.9369
Component 3: Mixing proportion: 0.344658 Mean: 6.8231
Component 4: Mixing proportion: 0.208333 Mean: 24.6758
Obviously, the calculated mean values of the Gaussians are not correct.
Where is my approach going wrong? I believe that either the first input to the fitgmdist function must somehow be normalised, or that I need to post-process the output. So far, my attempts have failed.
Upvotes: 0
Views: 934
Reputation: 71
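What's happening is that the mixture model is giving you the means of Gaussian components fitted to the bin counts, not to the measured values: fitgmdist treats every row of its first argument as one observation, so passing smoothValues makes it cluster the heights of the histogram. Instead of passing the histogram to fitgmdist, pass the raw FSC_data.FSC_HF data as the first argument.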
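A minimal sketch of that suggestion, assuming the Statistics and Machine Learning Toolbox is available. The synthetic vector x is a made-up stand-in for FSC_data.FSC_HF (its peak positions and widths are invented for illustration), and the 'Replicates' and 'MaxIter' settings are just one reasonable choice, not something taken from the question:

```matlab
% Hypothetical stand-in for the raw measurements FSC_data.FSC_HF:
% four well-separated Gaussian populations (values are invented).
rng(0);                                  % make the sketch reproducible
x = [ 50 +  5*randn(3000,1);
     120 +  8*randn(2000,1);
     200 +  6*randn(3500,1);
     300 + 10*randn(1500,1)];

% Fit the mixture to the raw samples (one observation per row),
% not to the histogram counts.
expectedPeaks = 4;
opts = statset('MaxIter', 1000);         % allow more EM iterations
gmm  = fitgmdist(x, expectedPeaks, ...
                 'Replicates', 5, ...    % several random starts, keep the best fit
                 'Options', opts);

% Component means are now in the units of the data, i.e. the peak centres.
peakCentres = sort(gmm.mu);
disp(peakCentres.');
```

With the real data, x would be replaced by FSC_data.FSC_HF(:) so that the input is a column vector. The histogram and the smoothing step are then only needed for plotting, not for the fit.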
Upvotes: 1