Reputation: 106
How would I plot a linear regression with dates in pyplot? I wasn't able to find a definitive answer to this question. This is what I've tried (courtesy of W3Schools' tutorial on linear regression).
import matplotlib.pyplot as plt
from scipy import stats
x = ['01/01/2019', '01/02/2019', '01/03/2019', '01/04/2019', '01/05/2019', '01/06/2019', '01/07/2019', '01/08/2019', '01/09/2019', '01/10/2019', '01/11/2019', '01/12/2019', '01/01/2020']
y = [12050, 17044, 14066, 16900, 19979, 17593, 14058, 16003, 15095, 12785, 12886, 20008]
slope, intercept, r, p, std_err = stats.linregress(x, y)
def myfunc(x):
    return slope * x + intercept
mymodel = list(map(myfunc, x))
plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()
Upvotes: 4
Views: 1817
Reputation: 40747
You first have to convert your dates into numbers to be able to do a regression (and to plot for that matter). Then you can instruct matplotlib to interpret the x-values as dates to get a nicely formatted axis:
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from scipy import stats
x = ['01/01/2019', '01/02/2019', '01/03/2019', '01/04/2019', '01/05/2019', '01/06/2019', '01/07/2019', '01/08/2019', '01/09/2019', '01/10/2019', '01/11/2019', '01/12/2019']
y = [12050, 17044, 14066, 16900, 19979, 17593, 14058, 16003, 15095, 12785, 12886, 20008]
# parse the strings with the datetime module; '%m' is the month ('%M' would be minutes)
# if the strings are day/month/year rather than month/day/year, use '%d/%m/%Y' instead
x = [datetime.datetime.strptime(i, '%m/%d/%Y') for i in x]
# convert the dates to numbers in matplotlib's own date units, so the axis labels come out right
x = mdates.date2num(x)
slope, intercept, r, p, std_err = stats.linregress(x, y)
def myfunc(x):
    return slope * x + intercept
mymodel = list(map(myfunc, x))
fig, ax = plt.subplots()
ax.scatter(x, y)
ax.plot(x, mymodel)
# instruct matplotlib on how to convert the numbers back into dates for the x-axis
locator = mdates.AutoDateLocator()
formatter = mdates.AutoDateFormatter(locator)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
plt.show()
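To see why the conversion matters, the same idea can be sketched with the standard library alone. This is only an illustration, not the method used above: it swaps `stats.linregress` for a hand-written closed-form least-squares fit, uses just the first four points from the question, and assumes the strings are month/day/year. The names `dates`, `values`, `days` and `predict` are made up for this example.

```python
import datetime

# First four points from the question (illustrative subset).
dates = ['01/01/2019', '01/02/2019', '01/03/2019', '01/04/2019']
values = [12050, 17044, 14066, 16900]

# Assumption: month/day/year. Use '%d/%m/%Y' for day/month/year strings.
DATE_FORMAT = '%m/%d/%Y'
parsed = [datetime.datetime.strptime(s, DATE_FORMAT) for s in dates]

# Turn each date into a plain number: days elapsed since the first date.
days = [(d - parsed[0]).days for d in parsed]

# Closed-form ordinary least squares for y = slope * x + intercept.
n = len(days)
mean_x = sum(days) / n
mean_y = sum(values) / n
sxx = sum((xi - mean_x) ** 2 for xi in days)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(days, values))
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(date_string):
    """Evaluate the fitted line at a date given in the same string format."""
    d = datetime.datetime.strptime(date_string, DATE_FORMAT)
    return slope * (d - parsed[0]).days + intercept
```

Because the regression only ever sees day counts, the slope reads as "change in y per day", and `predict('01/05/2019')` extends the line one day past the data.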
Upvotes: 6