Drako Kovak

Reputation: 31

Backpropagation vs Levenberg-Marquardt

Does anyone know the difference between backpropagation and Levenberg-Marquardt in neural network training? Sometimes I see LM described as a BP algorithm, and sometimes I see the opposite. Your help will be highly appreciated.

Thank you.

Upvotes: 2

Views: 1971

Answers (1)

Ash

Reputation: 4718

Those are two completely unrelated concepts.

Levenberg-Marquardt (LM) is an optimization method, while backprop is just the recursive application of the chain rule for derivatives.
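To see backprop as nothing more than the chain rule, here is a hand-rolled sketch for a hypothetical one-hidden-unit network with a squared loss (all names here are illustrative, not from any library):

```python
import numpy as np

def forward_backward(w1, w2, x, t):
    # Forward: y = w2 * tanh(w1 * x), loss = (y - t)^2
    h = np.tanh(w1 * x)
    y = w2 * h
    loss = (y - t) ** 2
    # Backward: the chain rule, applied recursively from the loss
    # down to each weight.
    dL_dy = 2.0 * (y - t)
    dL_dw2 = dL_dy * h                   # dL/dw2 = dL/dy * dy/dw2
    dL_dh = dL_dy * w2                   # propagate one layer back
    dL_dw1 = dL_dh * (1.0 - h**2) * x    # d tanh(u)/du = 1 - tanh(u)^2
    return loss, dL_dw1, dL_dw2
```

Nothing in that computation says anything about how the gradients are then used; that's where an optimizer like LM (or plain gradient descent) comes in.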

Intuitively, LM does this: when it is far from a local minimum, it ignores the curvature of the loss and behaves like gradient descent. As it gets closer to a local minimum, it pays more and more attention to the curvature, shifting from gradient descent toward a Gauss-Newton-like approach.
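A minimal sketch of that interpolation (my own naming, not any particular library's API): the damping coefficient `lam` controls where the step sits between the two regimes, and is typically shrunk after a successful step and grown after a failed one.

```python
import numpy as np

def lm_step(g, H, lam):
    """One damped step: solve (H + lam*I) dx = -g.

    Large lam: the identity term dominates, dx ~ -g/lam (gradient descent).
    Small lam: dx ~ -inv(H) @ g (a Gauss-Newton step).
    """
    return np.linalg.solve(H + lam * np.eye(len(g)), -g)
```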

The LM method needs both the gradient and the Hessian, as it solves variants of (H + λI)dx = -g, with H and g respectively the Hessian and the gradient of the loss. You can obtain the gradient via backpropagation. The Hessian is usually not as simple, but for a least-squares loss you can use the Gauss-Newton approximation H ≈ 2J^T J, where J is the Jacobian of the residuals, which means that in that case the pieces you need come out of the same backward passes that give you the gradient.
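Concretely, for a least-squares loss the whole update can be assembled from the residuals and their Jacobian (a sketch under those assumptions; in a network, J would come from one backward pass per residual):

```python
import numpy as np

def lm_update(r, J, lam):
    # Loss L = sum(r_i^2); r: residual vector of shape (m,),
    # J: Jacobian dr/dx of shape (m, n).
    g = 2.0 * J.T @ r    # exact gradient of L
    H = 2.0 * J.T @ J    # Gauss-Newton approximation of the Hessian
    return np.linalg.solve(H + lam * np.eye(J.shape[1]), -g)
```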

For neural networks, LM usually isn't practical: you can't afford to construct such a huge Hessian, and even if you could, it lacks the sparse structure needed to invert it efficiently.

Upvotes: 2
