Reputation: 6849
In the Linear Regression with One Variable | Cost Function video by Andrew Ng (around 5:17),
during the derivation of the cost function, I understand this step:
but why, at 5:17, does it become:
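(For reference, and assuming the expression being asked about is the cost function as it is usually written in this lecture, it is:

$$J(\theta_0,\theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$

where $h_\theta(x) = \theta_0 + \theta_1 x$ and $m$ is the number of training examples.)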
Upvotes: 0
Views: 2060
Reputation: 2947
When you're calculating the cost function, you're trying to get the mean squared deviation (MSD). If you don't divide by m, it's not really a mean squared value; it's basically just the sum of squared deviations.
As for the half: it simply takes half of the MSD, so the result can be called the half-MSD. When you take the derivative of the cost function (which is what gets used to update the parameters during gradient descent), the 2 coming down from the exponent cancels with the 1/2 multiplier, so the derivative is cleaner. These techniques, or similar ones, are widely used in math "to make the derivations mathematically more convenient".
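To see the cancellation explicitly (a sketch using the course's usual notation, where $h_\theta(x) = \theta_0 + \theta_1 x$):

$$
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2,
\qquad
\frac{\partial J}{\partial \theta_1}
= \frac{1}{2m} \sum_{i=1}^{m} 2\,\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}
= \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}.
$$

The factor of 2 produced by differentiating the square is exactly absorbed by the 1/2, leaving the clean $\frac{1}{m}\sum(\cdot)$ form used in the gradient descent update.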
Upvotes: 2
Reputation: 3283
IIRC it's so that when you take the derivative you don't need to scale it. It doesn't really make a difference, as the loss is mathematically equivalent; see: https://stats.stackexchange.com/a/313172
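A minimal numerical sketch of that equivalence (hypothetical toy data; not from the course): scaling the loss by a constant scales its gradient by the same constant, so gradient descent with a correspondingly adjusted learning rate takes identical steps.

import numpy as np

# Toy 1-D dataset; theta holds [intercept, slope], X has a column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
m = len(y)

def grad_half_msd(theta):
    # Gradient of (1/(2m)) * sum((X @ theta - y)^2):
    # the 2 from the square cancels the 1/2, giving (1/m) * X^T (X @ theta - y).
    return X.T @ (X @ theta - y) / m

def grad_sum_sq(theta):
    # Gradient of the unscaled sum of squared errors.
    return 2 * X.T @ (X @ theta - y)

theta = np.zeros(2)
# The two gradients differ only by the constant factor 2m, so minimizing
# either loss with appropriately scaled learning rates gives the same path.
print(grad_sum_sq(theta) / (2 * m))  # identical to grad_half_msd(theta)
print(grad_half_msd(theta))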
Upvotes: 0