user7693832

Reputation: 6849

Why is there a division by 2 in the cost function derivation?

In Andrew Ng's video "Linear Regression With One Variable | CostFunction" (around 5:17), the cost function is derived. I understand this step:

enter image description here

But why does it then (at 5:17) become:

enter image description here

Upvotes: 0

Views: 2060

Answers (2)

devReddit

Reputation: 2947

When you calculate the cost function, you're trying to get the mean squared deviation (MSD). If you don't divide by m, it's not really a mean; it's just the sum of the squared deviations.

As for the half: it simply halves the MSD, so you could call the result half-MSD. When you take the derivative of the cost function, which is used to update the parameters during gradient descent, the 2 from the exponent cancels with the 1/2 multiplier, so the derivative is cleaner. Tricks like this are widely used in math "to make the derivations mathematically more convenient".
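To spell out the cancellation, here is the derivative written out. This is a sketch that assumes the notation usually used in the course (hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$, $m$ training examples $(x^{(i)}, y^{(i)})$); the images in the question are not visible here, so the exact symbols are an assumption.

```latex
% Assumed notation: h_\theta(x) = \theta_0 + \theta_1 x, with m training examples.
\begin{align*}
J(\theta_0, \theta_1)
  &= \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
\frac{\partial J}{\partial \theta_1}
  &= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
   = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)} \\
\frac{\partial J}{\partial \theta_0}
  &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
\end{align*}
```

Without the 1/2, the same derivatives would carry a leading factor of 2, which changes nothing about where the minimum is.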

Upvotes: 2

jhso

Reputation: 3283

IIRC it's so that you don't need to rescale the derivative. It doesn't really make a difference, since multiplying the loss by a positive constant doesn't change where its minimum is; see: https://stats.stackexchange.com/a/313172
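A quick way to check this numerically is to run gradient descent on both versions of the loss. The snippet below is only a minimal sketch: the data, the learning rate, and the helper names `gradient` and `gradient_descent` are made up for illustration, and the model is assumed to be the one-variable hypothesis `theta0 + theta1 * x`.

```python
def gradient(theta0, theta1, xs, ys, scale):
    """Gradient of scale * sum((theta0 + theta1*x - y)**2) w.r.t. (theta0, theta1)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        err = theta0 + theta1 * x - y
        g0 += 2 * scale * err        # d/d(theta0) of scale * err**2
        g1 += 2 * scale * err * x    # d/d(theta1) of scale * err**2
    return g0, g1


def gradient_descent(xs, ys, scale, alpha, steps=5000):
    """Plain batch gradient descent starting from (0, 0)."""
    theta0 = theta1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient(theta0, theta1, xs, ys, scale)
        theta0 -= alpha * g0
        theta1 -= alpha * g1
    return theta0, theta1


# Toy data lying exactly on y = 1 + 2x (illustrative values only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m = len(xs)

# scale = 1/(2m) is the half-MSE from the video; scale = 1/m is the plain MSE.
half = gradient_descent(xs, ys, scale=1 / (2 * m), alpha=0.05)
full = gradient_descent(xs, ys, scale=1 / m, alpha=0.05)
print(half, full)
```

Both runs should end up very close to the same parameters (intercept 1, slope 2 for this data). The only practical difference is that the plain MSE has gradients twice as large, which is equivalent to using twice the learning rate.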

Upvotes: 0
