crysis405
crysis405

Reputation: 1131

Fitting exponential and 5 gaussians to data in python

I am trying fit an exponential function and 5 Gaussians to my data. What I am aiming for is something along these lines: (where gDNA Fit is the exponential; 1-5Nuc Fit are the 5 Gaussians; Total fit is the sum of all the fits)

enter image description here

The way I approached it was fitting the exponential and then based on that introduce a cut-off that would allow me to fit the gaussians without taking into consideration the already fitted data. (I have already cut the data at 100 as this is where it dips down to 0)

The problem is I don't seems to be able to fit the exponential properly and the gaussians are off the scale:

from scipy.optimize import curve_fit
from pylab import *
import matplotlib.pyplot

#Exponential

x = np.array([1.010000000000000000e+02,1.100000000000000000e+02,1.190000000000000000e+02,1.280000000000000000e+02,1.370000000000000000e+02,1.460000000000000000e+02,1.550000000000000000e+02,1.640000000000000000e+02,1.730000000000000000e+02,1.820000000000000000e+02,1.910000000000000000e+02,2.000000000000000000e+02,2.090000000000000000e+02,2.180000000000000000e+02,2.270000000000000000e+02,2.360000000000000000e+02,2.450000000000000000e+02,2.540000000000000000e+02,2.630000000000000000e+02,2.720000000000000000e+02,2.810000000000000000e+02,2.900000000000000000e+02,2.990000000000000000e+02,3.080000000000000000e+02,3.170000000000000000e+02,3.260000000000000000e+02,3.350000000000000000e+02,3.440000000000000000e+02,3.530000000000000000e+02,3.620000000000000000e+02,3.710000000000000000e+02,3.800000000000000000e+02,3.890000000000000000e+02,3.980000000000000000e+02,4.070000000000000000e+02,4.160000000000000000e+02,4.250000000000000000e+02,4.340000000000000000e+02,4.430000000000000000e+02,4.520000000000000000e+02,4.610000000000000000e+02,4.700000000000000000e+02,4.790000000000000000e+02,4.880000000000000000e+02,4.970000000000000000e+02,5.060000000000000000e+02,5.150000000000000000e+02,5.240000000000000000e+02,5.330000000000000000e+02,5.420000000000000000e+02,5.510000000000000000e+02,5.600000000000000000e+02,5.690000000000000000e+02,5.780000000000000000e+02,5.870000000000000000e+02,5.960000000000000000e+02,6.050000000000000000e+02,6.140000000000000000e+02,6.230000000000000000e+02,6.320000000000000000e+02,6.410000000000000000e+02,6.500000000000000000e+02,6.590000000000000000e+02,6.680000000000000000e+02,6.770000000000000000e+02,6.860000000000000000e+02,6.950000000000000000e+02,7.040000000000000000e+02,7.130000000000000000e+02,7.220000000000000000e+02,7.310000000000000000e+02,7.400000000000000000e+02,7.490000000000000000e+02,7.580000000000000000e+02,7.670000000000000000e+02,7.760000000000000000e+02,7.850000000000000000e+02,7.940000000000000000e+02,8.030000000000000000e+02,8.120000000000000000e+02,8.210000000000000000e+02,8.300000000000000000e+02,8.390000000000000000e+02,8.480000000000000000e+02,8.570000000000000000e+02,8.660000000000000000e+02,8.750000000000000000e+02,8.840000000000000000e+02,8.930000000000000000e+02,9.020000000000000000e+02,9.110000000000000000e+02,9.200000000000000000e+02,9.290000000000000000e+02,9.380000000000000000e+02,9.470000000000000000e+02,9.560000000000000000e+02,9.650000000000000000e+02,9.740000000000000000e+02,9.830000000000000000e+02,9.920000000000000000e+02])
y = np.array([3.579280000000000000e+05,3.172290000000000000e+05,1.759610000000000000e+05,1.352610000000000000e+05,1.069130000000000000e+05,9.721000000000000000e+04,9.908200000000000000e+04,1.168480000000000000e+05,1.266880000000000000e+05,1.264760000000000000e+05,1.279850000000000000e+05,1.198880000000000000e+05,1.117730000000000000e+05,1.005850000000000000e+05,9.038500000000000000e+04,7.532400000000000000e+04,6.235500000000000000e+04,5.249600000000000000e+04,4.445600000000000000e+04,3.808000000000000000e+04,3.612100000000000000e+04,3.460600000000000000e+04,3.209700000000000000e+04,3.008200000000000000e+04,3.090700000000000000e+04,3.208600000000000000e+04,2.949700000000000000e+04,3.111600000000000000e+04,3.125700000000000000e+04,3.152700000000000000e+04,3.198700000000000000e+04,3.373800000000000000e+04,3.171200000000000000e+04,3.124900000000000000e+04,3.109700000000000000e+04,3.002200000000000000e+04,2.720100000000000000e+04,2.413600000000000000e+04,1.873100000000000000e+04,1.768900000000000000e+04,1.510600000000000000e+04,1.358800000000000000e+04,1.354400000000000000e+04,1.198900000000000000e+04,1.182800000000000000e+04,6.926000000000000000e+03,1.230000000000000000e+04,3.734000000000000000e+03,6.631000000000000000e+03,7.085000000000000000e+03,7.151000000000000000e+03,7.195000000000000000e+03,7.265000000000000000e+03,6.966000000000000000e+03,6.823000000000000000e+03,6.357000000000000000e+03,5.977000000000000000e+03,5.464000000000000000e+03,4.941000000000000000e+03,4.543000000000000000e+03,3.992000000000000000e+03,3.593000000000000000e+03,3.156000000000000000e+03,2.955000000000000000e+03,2.740000000000000000e+03,2.701000000000000000e+03,2.528000000000000000e+03,2.481000000000000000e+03,2.527000000000000000e+03,2.476000000000000000e+03,2.456000000000000000e+03,2.461000000000000000e+03,2.420000000000000000e+03,2.346000000000000000e+03,2.326000000000000000e+03,2.278000000000000000e+03,2.108000000000000000e+03,1.893000000000000000e+03,1.771000000000000000e+03,1.654000000000000000e+03,1.547000000000000000e+03,1.389000000000000000e+03,1.325000000000000000e+03,1.130000000000000000e+03,1.057000000000000000e+03,9.460000000000000000e+02,9.790000000000000000e+02,8.990000000000000000e+02,8.460000000000000000e+02,8.360000000000000000e+02,8.040000000000000000e+02,8.330000000000000000e+02,7.690000000000000000e+02,7.020000000000000000e+02,7.360000000000000000e+02,6.390000000000000000e+02,6.690000000000000000e+02,6.770000000000000000e+02,6.100000000000000000e+02,5.700000000000000000e+02])

def func(x, a, c, d):
    return a*np.exp(-c*x)+d


#print np.exp(-x)

popt, pcov = curve_fit(func, x, y, p0=(1, 0.01, 1))

yy = func(x, *popt)

matplotlib.pyplot.plot(x, y, 'ko')
matplotlib.pyplot.plot(x, yy)

#gaussian

from sklearn import mixture
import scipy

gmm = mixture.GMM(n_components=5, covariance_type='full')
gmm.fit(y)
pdfs = [p * scipy.stats.norm.pdf(x, mu, sd) for mu, sd, p in zip(gmm.means_, (gmm.covars_)**2, gmm.weights_)]


density = np.sum(np.array(pdfs), axis=0)
#print density
matplotlib.pyplot.plot(x, density)

show()

Upvotes: 0

Views: 1089

Answers (1)

Christian K.
Christian K.

Reputation: 2823

If you do not mind to use least squares as opposed to maximum likelyhood I would suggest to fit the whole model at once, including the exponential with e.g. scipy curve_fit. You will never get a good fit to the exponential if you ignore the existance of the gaussian peaks. I recommend to use peak-o-mat (http://lorentz.sf.net) which is an interactive curve fitting software written in python. Within seconds you can get a result like this:

multi peak fit example

Upvotes: 1

Related Questions