ashmathi

Reputation: 13

How to plot lines between datapoints and the Line of best fit?

I want to draw a line from each data point to the line of best fit of my linear regression, like the grey lines in the attached image.

Image attached here

Upvotes: 1

Views: 1598

Answers (1)

mozway

Reputation: 262484

Here is a minimal working example.

In summary, the points are drawn with scatter, the fit line is computed with numpy.polyfit, and the vertical segments joining each point to the fit line are drawn as a single LineCollection.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import collections as mc

# random data
n = 100
np.random.seed(0)
X = np.random.uniform(0, 10, size=n)

noise = np.random.normal(0, 2, size=n)
Y = X*2 + 1 + noise

# setting up plot
ax = plt.subplot()
ax.scatter(X, Y, c='r', zorder=2)

# computing and plotting fit line (polyfit returns slope a, then intercept b)
a, b = np.polyfit(X, Y, 1)

xs = np.linspace(0, 10)
ax.plot(xs, a*xs + b, c='k', lw=2)

# computing and plotting grey lines: one segment per point,
# from the data point (x, y) to the fit line at (x, a*x + b)
lines = [[(x, y), (x, a*x + b)] for x, y in zip(X, Y)]
lc = mc.LineCollection(lines, colors='grey', linewidths=1, zorder=1)
ax.add_collection(lc)

plt.show()

Output:

scatter with fit line and vertical lines
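
As a possible alternative, the same vertical segments can be drawn with Axes.vlines, which takes the x positions plus the two y values for each segment and returns a LineCollection. The sketch below is only an illustration: the helper name plot_residual_segments and the use of np.random.default_rng are my own choices, not part of the code above.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_residual_segments(ax, x, y):
    """Scatter the data, draw the least-squares line, and join each point to it.

    Hypothetical helper for illustration; not a matplotlib function.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Degree-1 fit: polyfit returns the slope first, then the intercept.
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept

    # One vertical segment per point, from the observed y to the fitted y.
    segments = ax.vlines(x, y, fitted, colors="grey", linewidths=1, zorder=1)

    # Points on top of the segments, then the fit line.
    ax.scatter(x, y, c="r", zorder=2)
    xs = np.linspace(x.min(), x.max())
    ax.plot(xs, slope * xs + intercept, c="k", lw=2)
    return slope, intercept, segments


# Synthetic data in the same spirit as the example above (assumed values).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2 * x + 1 + rng.normal(0, 2, size=100)

fig, ax = plt.subplots()
slope, intercept, segments = plot_residual_segments(ax, x, y)
```

Calling plt.show() afterwards displays the figure. Compared with building the segment list by hand, vlines avoids the Python-level loop, but the result is the same kind of collection.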

Upvotes: 1
