Reputation: 3148
I'm trying to make a function that will calculate the mean squared error from y (true values) and y_pred (predicted ones) without using sklearn or other implementations.
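For reference, the mean squared error over n samples is:

MSE = (1/n) * sum_i (y_i - y_pred_i)^2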
Here is what I tried:
def mserror(y, y_pred):
    i = 0
    for i in range(len(y)):
        i += 1
    mse = ((y - y_pred) ** 2).mean(y)
    return mse
Can you please tell me what I'm doing wrong with the calculation and how it can be fixed?
Upvotes: 8
Views: 51215
Reputation: 257
Here's how to implement MSE in Python:
def mse_metric(actual, predicted):
    sum_error = 0.0
    # loop over all values
    for i in range(len(actual)):
        # difference between the actual and predicted value
        prediction_error = actual[i] - predicted[i]
        # accumulate the squared error
        sum_error += (prediction_error ** 2)
    # now normalize by the number of samples
    mean_error = sum_error / float(len(actual))
    return mean_error
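For example, with some made-up toy values:

actual = [3.0, -0.5, 2.0, 7.0]
predicted = [2.5, 0.0, 2.0, 8.0]
print(mse_metric(actual, predicted))  # 0.375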
Upvotes: -2
Reputation: 101
I would say:
def get_mse(y, y_pred):
    N = len(y)           # number of samples
    d1 = y - y_pred      # element-wise differences
    mse = (1 / N) * d1.dot(d1)
    return mse
This only works if y and y_pred are NumPy arrays, but you would want them to be NumPy arrays anyway, as long as you decide not to use other libraries, so that you can do vectorized math operations on them.
NumPy's dot() function computes the dot product of two arrays (you can also write np.dot(d1, d1)).
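A quick sketch of how this could be called (the values are made up):

import numpy as np

y = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(get_mse(y, y_pred))  # 0.375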
Upvotes: 2
Reputation: 257
Firstly, you are incrementing i yourself, but range already advances it to the next value on each iteration, so don't increment i again. The other issue is the mean: you are passing y into mean(), but you should simply take the mean of ((y - y_pred) ** 2). I hope you got the point.
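A minimal corrected sketch of the original function, assuming y and y_pred are NumPy arrays:

import numpy as np

def mserror(y, y_pred):
    # no loop needed: take the mean of the squared differences
    return ((y - y_pred) ** 2).mean()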
Upvotes: -2
Reputation: 3106
You are modifying the index for no reason; a for loop increments it anyway. Also, you are not using the index anywhere, e.g. there is no y[i] - y_pred[i], so you don't need the loop at all.
Use the arrays directly:

import numpy as np

mse = np.mean((y - y_pred)**2)
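For example, with toy arrays:

y = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(np.mean((y - y_pred)**2))  # 0.375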
Upvotes: 22