TheChemist

Reputation: 153

How to extrapolate a function based on x,y values?

Ok, so I started with Python a few days ago. I mainly use it for data science, since I am an undergraduate chemistry student. Now I have a small problem on my hands: I have to extrapolate a function. I know how to make simple diagrams and graphs, so please try to explain it as simply as possible. I start off with:

from matplotlib import pyplot as plt
from matplotlib import style
style.use('classic')
x = [0.632455532, 0.178885438, 0.050596443, 0.014310835, 0.004047715]
y = [114.75, 127.5, 139.0625, 147.9492188, 153.8085938]
x2 = [0.707, 0.2, 0.057, 0.016, 0.00453]
y2 = [2.086, 7.525, 26.59375, 87.03125, 375.9765625]

With these values I have to work out a way to extrapolate so that I can get a y (or y2) value when x = 0. I know how to do this mathematically, but I would like to know whether Python can do this and how to execute it in Python. Is there a simple way? Could you maybe give me an example with my given values? Thank you.

Upvotes: 2

Views: 4236

Answers (2)

Hugh Bothwell

Reputation: 56644

Taking a quick look at your data,

from matplotlib import pyplot as plt
from matplotlib import style
style.use('classic')

x1 = [0.632455532, 0.178885438, 0.050596443, 0.014310835, 0.004047715]
y1 = [114.75, 127.5, 139.0625, 147.9492188, 153.8085938]
plt.plot(x1, y1)

[plot of y1 vs x1]

x2 = [0.707, 0.2, 0.057, 0.016, 0.00453]
y2 = [2.086, 7.525, 26.59375, 87.03125, 375.9765625]
plt.plot(x2, y2)

[plot of y2 vs x2]

This is definitely not linear. If you know what sort of function this follows, you may want to use scipy's curve fitting to get a best-fit function which you can then use.
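For example, here is a minimal sketch using scipy.optimize.curve_fit, assuming (purely for illustration) that the second data set follows a power law y = a * x**b:

from scipy.optimize import curve_fit

def power_law(x, a, b):
    return a * x ** b

# p0 gives the optimizer a rough starting point (a near 1, a decaying exponent)
popt, pcov = curve_fit(power_law, x2, y2, p0=(1.0, -1.0))
a, b = popt

# the fitted function can then be evaluated at any x > 0
print(power_law(0.001, a, b))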

Edit:

If we convert the plots to log-log,

import numpy as np

plt.plot(np.log(x1), np.log(y1))

[log-log plot of y1 vs x1]

plt.plot(np.log(x2), np.log(y2))

[log-log plot of y2 vs x2]

they look pretty linear (if you squint a bit). Finding a best-fit line,

np.polyfit(np.log(x1), np.log(y1), 1)
# array([-0.05817402,  4.73809081])

np.polyfit(np.log(x2), np.log(y2), 1)
# array([-1.01664659,  0.36759068])

we can convert back to functions,

# f1:
# log(y) = -0.05817402 * log(x) + 4.73809081
# so
# y = (e ** 4.73809081) * x ** (-0.05817402)
def f1(x):
    return np.e ** 4.73809081 * x ** (-0.05817402)

xs = np.linspace(0.01, 0.8, 100)
plt.plot(x1, y1, xs, f1(xs))

[plot of the (x1, y1) data together with the fitted f1]

# f2:
# log(y) = -1.01664659 * log(x) + 0.36759068
# so
# y = (e ** 0.36759068) * x ** (-1.01664659)
def f2(x):
    return np.e ** 0.36759068 * x ** (-1.01664659)

plt.plot(x2, y2, xs, f2(xs))

[plot of the (x2, y2) data together with the fitted f2]

The second looks pretty darn good; the first still needs a bit of refinement (i.e., find a more representative function and curve-fit it, as sketched below). But you should have a pretty good picture of the process ;-)
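For example, reusing x1, y1 and the imports from above, one possible refinement is to fit a form that levels off as x approaches 0 (the model below is only a guess for illustration, not something derived from the underlying chemistry):

from scipy.optimize import curve_fit

# hypothetical saturating model: y = a - b * x**c, which stays finite at x = 0 (y(0) = a)
def f1_refined(x, a, b, c):
    return a - b * x ** c

popt, pcov = curve_fit(f1_refined, x1, y1, p0=(160.0, 60.0, 0.3))
a, b, c = popt

xs = np.linspace(0.001, 0.8, 100)
plt.plot(x1, y1, xs, f1_refined(xs, a, b, c))

# the extrapolated value at x = 0 is then simply the fitted parameter a
print(a)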

Upvotes: 5

pault

Reputation: 43504

Here's some example code that can hopefully help you get started on building a linear model for your purposes.

import numpy as np
from sklearn.linear_model import LinearRegression
from matplotlib import pyplot as plt

# sample data
x = [0.632455532, 0.178885438, 0.050596443, 0.014310835, 0.004047715]
y = [114.75, 127.5, 139.0625, 147.9492188, 153.8085938]

# linear model
lm = LinearRegression()
lm.fit(np.array(x).reshape(-1, 1), y)

test_x = np.linspace(0.01, 0.7, 100)
test_y = lm.predict(test_x.reshape(-1, 1))

## try linear model with log(x)
lm2 = LinearRegression()
lm2.fit(np.log(np.array(x)).reshape(-1, 1), y)

test_y2 = lm2.predict(np.log(test_x).reshape(-1, 1))

# plot
plt.figure()
plt.plot(x, y, label='Given Data')
plt.plot(test_x, test_y, label='Linear Model')
plt.plot(test_x, test_y2, label='Log-Linear Model')
plt.legend()

Which produces the following:

[Model Comparison: given data, linear model, and log-linear model]

As @Hugh Bothwell showed, the values you gave do not have a linear relationship. However, taking the log of x seems to produce a better fit.
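Since the question asks for a value at x = 0, here is a quick sketch of how these fitted models might be queried (note that the log-linear model cannot be evaluated at exactly x = 0, because log(0) is undefined, so an arbitrary small positive x is used instead):

# the plain linear model can be evaluated directly at x = 0
print(lm.predict(np.array([[0.0]])))

# the log-linear model needs x > 0, so evaluate at a small value near zero
print(lm2.predict(np.log(np.array([[1e-4]]))))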

Upvotes: 2
