iron2man
iron2man

Reputation: 1827

Python - curve_fit gives incorrect coefficients

I'm trying to pass two arrays for a fitting function, that takes both values.

Data file:

Column 1: Time Column 2: Temperature Column 3: Volume Column 4: Pressure

0.000,0.946,4.668,0.981
0.050,0.946,4.668,0.981
0.100,0.946,4.669,0.981
0.150,0.952,4.588,0.996
0.200,1.025,4.008,1.117
0.250,1.210,3.093,1.361
0.300,1.445,2.299,1.652
0.350,1.650,1.803,1.887
0.400,1.785,1.524,2.038
0.450,1.867,1.340,2.145
0.500,1.943,1.138,2.280
0.550,2.019,0.958,2.411
0.600,2.105,0.750,2.587
0.650,2.217,0.542,2.791
0.700,2.332,0.366,2.978
0.750,2.420,0.242,3.116
0.800,2.444,0.219,3.114
0.850,2.414,0.219,3.080

here is the code

import numpy as np
from scipy.optimize import curve_fit 

# Importing the Data
Time_Air1 = []
Vol_Air1 = []
Temp_Air1 = []
Pres_Air1 = []
with open('Good_Air_Run1.csv', 'r') as Air1:
    reader = csv.reader(Air1, delimiter=',')
    for row in reader:
        Time_Air1.append(row[0])
        Temp_Air1.append(row[1])
        Vol_Air1.append(row[2])
        Pres_Air1.append(row[3])

# Arrays are now passable floats
Time_Air1 = np.float32(np.array(Time_Air1))
Vol_Air1 = np.float32(np.array(Vol_Air1))
Temp_Air1 = np.float32(np.array(Temp_Air1))
Pres_Air1 = np.float32(np.array(Pres_Air1))

# fitting Model
def model_Gamma(V, gam, C):
    return -gam*np.log(V) + C

# Air Data Fitting Data
x1 = Vol_Air1
y1 = Pres_Air1

p0_R1 = (1.0 ,1.0) 
optR1, pcovR1 = curve_fit(model_Gamma, x1, y1, p0_R1)
gam_R1, C_R1 = optR1
gam_R1p, C_R1p = pcovR1
y1Mair = model_Gamma2(x_air1, gam_R1, C_R1)

compute the gamma coefficient, but it's not giving me the value i'm expecting, ~1.2. It gives me ~0.72

Yes this is the correct value because my friend fit the data into gnuplot and got that value.

If there is any information needed to actually try this, i'm happy to supply it.

Upvotes: 1

Views: 460

Answers (1)

Oliver W.
Oliver W.

Reputation: 13459

Caveat: the result obtained here for gamma (about 1.7) still deviates from the postulated 1.2. This answer merely highlights the source of possible errors and illustrates what a good fit might look like.

You are trying to fit data where the dependent variable is related to the independent variable by a model that resembles that of adiabatic processes for ideal gases. Here, the pressure and the volume of a gass are related through

pressure * volume**gamma = constant

When you rearrange the left hand side and right hand side, you get:

pressure = constant * volume**-gamma

or in logarithmic form:

log(pressure) = log(constant) - gamma * log(volume)

You could fit the pressure data to the volume data using either of these 2 forms, but the fit might not be optimal, because of measurement errors. One such error could be a fixed offset (e.g. some solid object is present in a beaker: the volume scale on the beaker will not represent accurately the volume of any liquid you pour in it). When you account for such errors, often times a fit becomes markedly better.

Below, I've shown the fitting of your data using 3 models: the first is your model, the second takes into account a volume offset, and the third is a non-logarithmic variant of the 2nd model (it is basically the 2nd equation, with an optional volume offset). Note that in your code when you fit to what I call model1, you do not pass log(pressure) to the model, which would only make sense in case your pressure data is already tabulated on a logarithmic scale.

>>> import numpy as np
>>> from scipy.optimize import curve_fit
>>> data = np.genfromtxt('/tmp/datafile.txt',
...     names=('time', 'temp', 'vol', 'press'), delimiter=',', usecols=range(4))
>>> def model1(volume, gamma, c):
...         return np.log(c) - gamma*np.log(volume)
...
>>> def model2(volume, gamma, c, volume_offset):
...         return np.log(c) - gamma*np.log(volume + volume_offset)
...
>>> def model3(volume, gamma, c, volume_offset):
...         return c * (volume + volume_offset)**(-gamma)
...
>>> vol, press = data['vol'], data['press']
>>> guess1, _ = curve_fit(model1, vol, np.log(press))
>>> guess2, _ = curve_fit(model2, vol, np.log(press))
>>> guess3, _ = curve_fit(model3, vol, press)
>>> guess1, guess2, guess3
(array([ 0.38488521,  2.04536926]),
 array([  1.7269364 ,  44.57369479,   4.44625865]),
 array([  1.73186133,  45.20087949,   4.46364872]))
>>> rms = lambda x: np.sqrt(np.mean(x**2))
>>> rms( press - np.exp(model1(vol, *guess1)))
0.29464410744456304
>>> rms(press - model3(vol, *guess3))
0.012672077620951249

Notice how guess2 and guess3 are nearly identical The last two lines indicate the rms error. You'll notice that it is smaller for the model that takes into account the offset (if you plot them, you'll see the fit is much better than when you use model1*).

adiabatic process - curve fitting

As a final remark, have a look at numpy's excellent functions for importing data, like the one I've shown here (np.genfromtxt), as they can save you a lot of tedious typing, like I demonstrated in this code.

Footnote: * when you plot using model1, don't forget to put everything back to linear scale, like this:

plt.plot(vol, np.exp(model1(vol, *guess1)))

Upvotes: 1

Related Questions