Reputation: 171
I have run a KNN model. Now i want to plot the residual vs predicted value plot. Every example from different websites shows that i have to first run a linear regression model. But i couldn't understand how to do this. Can anyone help? Thanks in advance. Here is my model-
train, validate, test = np.split(df.sample(frac=1), [int(.6*len(df)), int(.8*len(df))])
x_train = train.iloc[:,[2,5]].values
y_train = train.iloc[:,4].values
x_validate = validate.iloc[:,[2,5]].values
y_validate = validate.iloc[:,4].values
x_test = test.iloc[:,[2,5]].values
y_test = test.iloc[:,4].values
clf=neighbors.KNeighborsRegressor(n_neighbors = 6)
clf.fit(x_train, y_train)
y_pred = clf.predict(x_validate)
Upvotes: 12
Views: 48231
Reputation: 199
Residuals are nothing but how much your predicted values differ from actual values. So, it's calculated as actual values-predicted values. In your case, it's residuals = y_test-y_pred. Now for the plot, just use this;
import matplotlib.pyplot as plt
plt.scatter(residuals,y_pred)
plt.show()
Upvotes: 9
Reputation: 4864
What is the question? The residuals are simply y_test-y_pred
. Now use seaborn's regplot.
Upvotes: 3