Reputation: 6849
In the Linear Regression with One Variable | Cost Function video by Andrew Ng (around 5:17),
during the derivation of the cost function, I understand this step:
but why, at 5:17, does it become:
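(For reference, and assuming the expression being asked about is the cost function as it is usually written in this lecture, it is:

$$J(\theta_0,\theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$

where $h_\theta(x) = \theta_0 + \theta_1 x$ and $m$ is the number of training examples.)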
Upvotes: 0
Views: 2060
Reputation: 2947
When you're calculating the cost function, you're trying to get the mean squared deviation (MSD). If you don't divide by m, it's not really a mean squared value; it's basically just the sum of squared deviations.
As for the half: it simply takes half of the MSD, so the result can be called the half-MSD. When you take the derivative of the cost function (which is what gets used to update the parameters during gradient descent), the 2 coming down from the exponent cancels with the 1/2 multiplier, so the derivative is cleaner. These techniques, or similar ones, are widely used in math "to make the derivations mathematically more convenient".
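To see the cancellation explicitly (a sketch using the course's usual notation, where $h_\theta(x) = \theta_0 + \theta_1 x$):

$$
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2,
\qquad
\frac{\partial J}{\partial \theta_1}
= \frac{1}{2m} \sum_{i=1}^{m} 2\,\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}
= \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}.
$$

The factor of 2 produced by differentiating the square is exactly absorbed by the 1/2, leaving the clean $\frac{1}{m}\sum(\cdot)$ form used in the gradient descent update.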
Upvotes: 2
Reputation: 3283
IIRC it's so that when you take the derivative you don't need to scale it. It doesn't really make a difference, as the loss is mathematically equivalent; see: https://stats.stackexchange.com/a/313172
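A minimal numerical sketch of that equivalence (hypothetical toy data; not from the course): scaling the loss by a constant scales its gradient by the same constant, so gradient descent with a correspondingly adjusted learning rate takes identical steps.

import numpy as np

# Toy 1-D dataset; theta holds [intercept, slope], X has a column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
m = len(y)

def grad_half_msd(theta):
    # Gradient of (1/(2m)) * sum((X @ theta - y)^2):
    # the 2 from the square cancels the 1/2, giving (1/m) * X^T (X @ theta - y).
    return X.T @ (X @ theta - y) / m

def grad_sum_sq(theta):
    # Gradient of the unscaled sum of squared errors.
    return 2 * X.T @ (X @ theta - y)

theta = np.zeros(2)
# The two gradients differ only by the constant factor 2m, so minimizing
# either loss with appropriately scaled learning rates gives the same path.
print(grad_sum_sq(theta) / (2 * m))  # identical to grad_half_msd(theta)
print(grad_half_msd(theta))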
Upvotes: 0