Reputation: 2309
I want to use curve_fit to do exponential regression so that I can calculate the yearly growth rate of my data.
x is the year and y is the assets. If the fit worked, the growth rate would be exp(b) - 1. But it returns (1.0, 1.0, 1.0) as popt.
import pandas as pd
import numpy as np
import scipy.stats as stats
from datetime import datetime
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
df = pd.DataFrame([
[2011, 255],
[2012, 349],
[2013, 449],
[2014, 554],
[2015, 658]
], columns=['x', 'y'])
def func(x, a, b, c):
    return a * np.exp(b * x + c)
popt, pcov = curve_fit(func, df['x'], df['y'])
print('popt', popt)
df['y1'] = func(df['x'], *popt)
print('df\n', df)
plt.plot(df['x'], df['y'])
plt.plot(df['x'], df['y1'], 'g--',
         label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
Output:
popt [1.0 1.0 1.0]
df
x y y1
0 2011 255 inf
1 2012 349 inf
2 2013 449 inf
3 2014 554 inf
4 2015 658 inf
If I change x to [1, 2, 3, 4, 5], the fit works.
Upvotes: 0
Views: 176
Reputation: 7157
You are encountering an overflow: your x values are too large for np.exp(x) to be representable as a 64-bit floating-point number; see np.exp(2015), which evaluates to inf. One way to handle this is to fit the function g(x) = func(x - 2011) instead:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
x = np.array([2011, 2012, 2013, 2014, 2015])
y = np.array([255, 349, 449, 554, 658])
def func(x, a, b, c):
    return a * np.exp(b * x + c)
def g(x, a, b, c):
    return func(x - 2011, a, b, c)
popt, pcov = curve_fit(g, x, y)
plt.plot(x, g(x, *popt))
plt.show()
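To get from the shifted fit to the yearly growth rate you asked about, something like the following sketch might help. Note that a and c are redundant in func, since a * exp(c) is just one constant, so this version uses a two-parameter model. The name shifted_exp and the starting guess p0 are my own assumptions, not anything curve_fit requires.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([2011, 2012, 2013, 2014, 2015], dtype=float)
y = np.array([255, 349, 449, 554, 658], dtype=float)

# exp() of anything above roughly 709 overflows a 64-bit float, so the
# default starting guess b = 1 already yields inf for x around 2011.
with np.errstate(over="ignore"):
    overflowed = np.isinf(np.exp(2015.0))

def shifted_exp(x, a, b):
    # Hypothetical two-parameter model: a is the fitted value at x = 2011.
    return a * np.exp(b * (x - 2011))

# Assumed starting guess: first data point for a, 10% per year for b.
popt, pcov = curve_fit(shifted_exp, x, y, p0=(y[0], 0.1))
a, b = popt

# Shifting x only rescales a, so b (and the growth rate) is unaffected.
growth_rate = np.exp(b) - 1
print("a=%.3f, b=%.3f, yearly growth=%.1f%%" % (a, b, 100 * growth_rate))
```

Because the shift is absorbed entirely into a, exp(b) - 1 is the same growth rate you would get from an unshifted fit, if that fit could be evaluated without overflowing.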
Upvotes: 2