How do you calculate the training error and validation error of a linear regression model?

Question

I have a linear regression model whose cost function is the sum-of-squares error. I've split my full dataset into three sets: training, validation, and test. I'm not sure how to calculate the training error and the validation error (and the difference between the two).

Is the training error the residual sum of squares (RSS) computed on the training dataset?

An example of what I'm asking: if I were doing this in Python with, say, 90 data points in the training set, would this be the correct code for the training error?

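# y_predicted[i] is the model's prediction at X_train[i]; y_train[i] is the actual value there
y_predicted = f(X_train, theta)
training_error = 0
for i in range(len(y_train)):  # 90 data points in this example
    residual = y_predicted[i] - y_train[i]
    training_error += residual * residual

# Halve the sum, matching a cost function of the form (1/2) * sum of squared residuals
training_error = training_error / 2
print('The training error for this regression model is:', training_error)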
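For comparison, here is a vectorized sketch of the same calculation, applied to both the training and the validation set. It assumes NumPy; the model `f`, the helper `half_sse`, and the synthetic data are all hypothetical stand-ins made up for illustration, not part of my actual setup. The idea is that `theta` is fitted on the training data only, and the same error function is then evaluated once on each split.

```python
import numpy as np

def half_sse(y_pred, y_true):
    """Half the sum of squared residuals (same 1/2 factor as the loop above)."""
    residuals = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return 0.5 * np.sum(residuals ** 2)

def f(X, theta):
    """Hypothetical linear model: intercept theta[0] plus X @ theta[1:]."""
    return theta[0] + X @ theta[1:]

# Synthetic data, invented purely so the example runs end to end
rng = np.random.default_rng(0)
true_theta = np.array([1.0, 2.0, -3.0])
X_train = rng.normal(size=(90, 2))
X_val = rng.normal(size=(30, 2))
y_train = f(X_train, true_theta) + rng.normal(scale=0.1, size=90)
y_val = f(X_val, true_theta) + rng.normal(scale=0.1, size=30)

# Fit theta by ordinary least squares using the training set only
A_train = np.column_stack([np.ones(len(X_train)), X_train])
theta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Same error function, evaluated on each split with the same fitted theta
training_error = half_sse(f(X_train, theta), y_train)
validation_error = half_sse(f(X_val, theta), y_val)

# The splits differ in size, so per-point averages (MSE) are easier to compare
train_mse = 2 * training_error / len(y_train)
val_mse = 2 * validation_error / len(y_val)

print('Training error (half SSE):', training_error)
print('Validation error (half SSE):', validation_error)
print('Training MSE:', train_mse, '| Validation MSE:', val_mse)
```

Because the raw sum grows with the number of points, the half-SSE values for a 90-point training set and a 30-point validation set are not directly comparable; dividing by the number of points in each set is one common way to put them on the same scale.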

