Reputation: 13
I want to draw a line segment from each data point in my linear regression to the line of best fit, like the grey lines in the attached image.
Upvotes: 1
Views: 1598
Reputation: 262484
Here is a minimal working example.
In summary, the points are a scatter, the fit line is computed with numpy.polyfit, and the vertical segments are a LineCollection.
import numpy as np
import matplotlib.pyplot as plt
# random data
n = 100
np.random.seed(0)
X = np.random.uniform(0,10,size=n)
noise = np.random.normal(0,2,size=n)
Y = X*2+1+noise
# setting up plot
ax = plt.subplot()
ax.scatter(X, Y, c='r', zorder=2)
# computing and plotting fit line
a, b = np.polyfit(X, Y, 1)
xs = np.linspace(0,10)
ax.plot(xs, a*xs+b, c='k', lw=2)
# computing and plotting grey lines
from matplotlib import collections as mc
lines = [[(i,j), (i,i*a+b)] for i,j in zip(X,Y)]
lc = mc.LineCollection(lines, colors='grey', linewidths=1, zorder=1)
ax.add_collection(lc)
# display the figure when running as a script
plt.show()
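As a possible alternative to building the LineCollection by hand, the same segments can probably be drawn with `Axes.vlines`, which takes the x positions plus the two y endpoints of each segment. The sketch below reuses the synthetic data from the example above; the helper name `plot_residual_segments` is made up for illustration and is not part of matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_residual_segments(x, y, ax=None):
    """Hypothetical helper: scatter the points, draw the least-squares
    line, and connect each point to the line with a vertical segment."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if ax is None:
        ax = plt.gca()

    # Degree-1 fit; polyfit returns coefficients from highest power down.
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept

    ax.scatter(x, y, c='r', zorder=2)
    xs = np.linspace(x.min(), x.max())
    ax.plot(xs, slope * xs + intercept, c='k', lw=2)

    # One vertical segment per point, from the observed y to the fitted y.
    segments = ax.vlines(x, y, fitted, colors='grey', linewidths=1, zorder=1)
    return slope, intercept, segments


# Same synthetic data as in the example above.
n = 100
np.random.seed(0)
X = np.random.uniform(0, 10, size=n)
noise = np.random.normal(0, 2, size=n)
Y = X * 2 + 1 + noise

slope, intercept, segments = plot_residual_segments(X, Y)
```

Call `plt.show()` (or `plt.savefig(...)`) afterwards to view the figure. `vlines` returns a LineCollection as well, so the result can be styled the same way as the hand-built one.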
Output:
Upvotes: 1