Mark

Reputation: 2148

Fit a logarithmic curve to data points and extrapolate out in numpy

I have a set of data points (x and y in the code). I would like to plot these points and fit a curve to them that shows what value of x would be required to make y = 100.0 (y values are percentages). Here is what I have tried, but my curve is a polynomial of degree 3 (which I know is wrong). To me, the data looks logarithmic, but I do not know how to polyfit a logarithmic curve to my data.

import numpy as np
import matplotlib.pyplot as plt

x = np.array([4,8,15,29,58,116,231,462,924,1848])
y = np.array([1.05,2.11,3.95,7.37,13.88,25.46,43.03,64.28,81.97,87.43])

for x1, y1 in zip(x,y):
    plt.plot(x1, y1, 'ro')

z = np.polyfit(x, y, 3)
f = np.poly1d(z)

for x1 in np.linspace(0, 1848, 110):
    plt.plot(x1, f(x1), 'b+')

plt.show()

This is what I get so far

Upvotes: 3

Views: 7348

Answers (3)

Brenlla

Reputation: 1481

It looks like a binding curve:

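import scipy as sp
import scipy.optimize  # ensures sp.optimize is loaded

# x, y, np and plt are as defined in the question
def binding(x,kd,bmax):
    return (bmax*x)/(x+kd)
param=sp.optimize.curve_fit(binding, x,y)

plt.plot(x,y,'o',np.arange(2000),binding(np.arange(2000),*param[0]))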

In which case, strictly speaking, y only approaches bmax as x goes to infinity; if the fitted bmax is 100 or less, y=100% is never reached at any finite x.
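
To make that concrete, the binding model above can be inverted for a target value: solving y = bmax*x/(x + kd) for x gives x = y*kd/(bmax - y), which is finite only when y < bmax. A minimal sketch using the data from the question; the helper name `x_for_target` and the starting guess `p0` are assumptions of mine, not part of the original code:

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([4, 8, 15, 29, 58, 116, 231, 462, 924, 1848])
y = np.array([1.05, 2.11, 3.95, 7.37, 13.88, 25.46, 43.03, 64.28, 81.97, 87.43])

def binding(x, kd, bmax):
    return (bmax * x) / (x + kd)

def x_for_target(target, kd, bmax):
    """Hypothetical helper: x at which binding() equals target, or inf if unreachable."""
    if target >= bmax:
        return np.inf  # the curve only approaches bmax asymptotically
    return target * kd / (bmax - target)

# p0 is an assumed rough starting guess (kd of a few hundred, bmax near 100)
(kd, bmax), _ = curve_fit(binding, x, y, p0=[300.0, 100.0])
print(f"kd = {kd:.1f}, bmax = {bmax:.1f}")
print("x needed for y = 100:", x_for_target(100.0, kd, bmax))
```

Whether the answer is finite depends entirely on the fitted bmax, so it is worth printing the parameters before trusting the extrapolation.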

Upvotes: 4

Matteo Vilucchio

Reputation: 51

The way I solve this kind of problem is by using scipy.optimize.curve_fit, a function you import from scipy.optimize.

The function takes as its first argument a function that you define, e.g. def f( x, a, b ). That function must take the independent variable as its first argument, and all remaining arguments are the parameters to fit.
Then .curve_fit() takes the x-data and the y-data (1-D numpy arrays are fine). It returns the array of best-fit parameters together with their estimated covariance matrix. In the end you should have something like this.

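import numpy as np
from scipy.optimize import curve_fit

# x and y are the arrays from the question
def l( x, a, b, c, d ):
    return a*np.log( b*x + c ) + d

# curve_fit returns (best-fit parameters, covariance matrix)
param, cov = curve_fit( l, x, y )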
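
For reference, `curve_fit` returns two things, the optimal parameters and their covariance matrix, so the result is usually unpacked. A minimal sketch with the data from the question; I have assumed a simpler two-parameter model here (`log_model`, my own name) because the four-parameter form above has redundant parameters (a*log(b*x + c) + d equals a*log(x + c/b) + a*log(b) + d), which can make the fit ill-conditioned:

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([4, 8, 15, 29, 58, 116, 231, 462, 924, 1848])
y = np.array([1.05, 2.11, 3.95, 7.37, 13.88, 25.46, 43.03, 64.28, 81.97, 87.43])

def log_model(x, a, d):
    # assumed simplified model: y = a*ln(x) + d
    return a * np.log(x) + d

# popt: best-fit parameters, pcov: their estimated covariance matrix
popt, pcov = curve_fit(log_model, x, y)
perr = np.sqrt(np.diag(pcov))  # one-standard-deviation uncertainty of each parameter

a, d = popt
print(f"a = {a:.3f} +/- {perr[0]:.3f}")
print(f"d = {d:.3f} +/- {perr[1]:.3f}")
```

Large values in `perr` relative to the parameters themselves are a hint that the chosen model does not describe the data well.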

Upvotes: 3

tel

Reputation: 13999

You actually don't need to use any fitting functions from Numpy or Scipy, since there's a "simple" closed form formula for finding the least-squares fit to a logarithmic curve. Here's an implementation in Python:

import numpy as np

def logFit(x,y):
    # least-squares fit of y = a + b*ln(x); cache some frequently reused terms
    sumy = np.sum(y)
    sumlogx = np.sum(np.log(x))

    b = (x.size*np.sum(y*np.log(x)) - sumy*sumlogx)/(x.size*np.sum(np.log(x)**2) - sumlogx**2)
    a = (sumy - b*sumlogx)/x.size

    return a,b
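
Written out, the function above computes the standard least-squares solution for the model y = a + b ln x with n data points (the summation notation is mine, chosen to match the variable names in the code):

```latex
b = \frac{n \sum_i y_i \ln x_i - \sum_i y_i \sum_i \ln x_i}
         {n \sum_i (\ln x_i)^2 - \left(\sum_i \ln x_i\right)^2},
\qquad
a = \frac{\sum_i y_i - b \sum_i \ln x_i}{n}
```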

You could then apply it to your problem like so:

x = np.array([4,8,15,29,58,116,231,462,924,1848])
y = np.array([1.05,2.11,3.95,7.37,13.88,25.46,43.03,64.28,81.97,87.43])

def logFunc(x, a, b):
    return a + b*np.log(x)

plt.plot(x, y, ls="none", marker='.')

xfit = np.linspace(1,2000,num=200)  # start at 1, since log(0) is -inf
plt.plot(xfit, logFunc(xfit, *logFit(x,y)))
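
As a cross-check, the same coefficients fall out of `np.polyfit` applied to `np.log(x)`, since the model is linear in ln x. Inverting the fitted line then gives the x at which it reaches a given y. A small sketch with the question's data; `x_at` is a hypothetical helper name, and the extrapolated value is only as trustworthy as the logarithmic model itself:

```python
import numpy as np

x = np.array([4, 8, 15, 29, 58, 116, 231, 462, 924, 1848])
y = np.array([1.05, 2.11, 3.95, 7.37, 13.88, 25.46, 43.03, 64.28, 81.97, 87.43])

# Degree-1 polyfit in ln(x) returns [slope, intercept], i.e. [b, a] for y = a + b*ln(x)
b, a = np.polyfit(np.log(x), y, 1)

def x_at(y_target, a, b):
    """Hypothetical helper: invert y = a + b*ln(x) for x."""
    return np.exp((y_target - a) / b)

x100 = x_at(100.0, a, b)
print(f"a = {a:.3f}, b = {b:.3f}, x at y = 100: {x100:.0f}")
```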

I don't think your data is logarithmic, though:

(image: plot of the fit against the data)

Upvotes: 7
