Reputation: 67
I've split my data into X_train, X_test, y_train, and y_test, and am trying to compare y_pred (my model's prediction on the X_test set) with the ground-truth values, y_test.
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Splitting data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
# Trying to predict X_test with the model
y_pred = model.predict(X_test)
# How do I compare y_pred with y_test?
print(model.score(X_test, y_test))
What do I pass as parameters to model.score( , ) to compare y_pred with y_test? Do I print the score of X_test and y_test? My code just doesn't seem right.
Upvotes: 0
Views: 1336
Reputation: 8122
LinearRegression.score works in the way you called it: you pass in an X and a corresponding y, and it scores the y against a prediction it does not share with you.
Accordingly, I recommend not using model.score(), because it's a bit of an opaque function: you never get to see the prediction, and you don't know which metric it uses without checking the docs (it depends on the model; for LinearRegression it's R²).
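To see concretely what score() is doing under the hood, here's a small sketch (the data is synthetic, purely to illustrate the equivalence): for LinearRegression, score(X, y) is exactly r2_score computed on the prediction it makes internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data, just for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)

# score() is just r2_score applied to a prediction you never see;
# computing it explicitly gives the same number.
assert np.isclose(model.score(X, y), r2_score(y, model.predict(X)))
```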
Better to make a prediction, import the metric you want, and compute it explicitly. For example, to use mean squared error:
from sklearn.metrics import mean_squared_error
y_pred = model.predict(X_test)
mean_squared_error(y_test, y_pred)
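Putting it all together, a self-contained sketch of the full workflow (with synthetic data, so the actual numbers are only illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data, just so the example runs end to end
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = X @ np.array([2.0, -1.0, 0.5, 3.0]) + rng.normal(scale=0.2, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Explicit metrics: you see the prediction and you choose the metric
print("MSE:", mean_squared_error(y_test, y_pred))
print("R^2:", r2_score(y_test, y_pred))
```

This way the prediction (y_pred) is a variable you can inspect, plot, or feed into any other metric from sklearn.metrics.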
Upvotes: 2