JD.

Reputation: 331

Setting constant learning rates in Pytorch

I am optimizing LSTM networks in PyTorch using the Adam optimizer. I have the feeling that my learning rate is decaying too fast, but I am not even 100% sure whether Adam does that, since I can't find good documentation. If Adam decays the learning rate by default, is there a way to turn this off and set a constant learning rate?

Upvotes: 0

Views: 1304

Answers (1)

ForceBru

Reputation: 44838

"I can't find good documentation" - you could read the original paper, for example. Also, the documentation is here: https://pytorch.org/docs/stable/generated/torch.optim.Adam.html.

If by "learning rate" you mean the lr parameter of torch.optim.Adam, then it remains constant - Adam itself doesn' modify it, in contrast to learning-rate schedulers. However, Adam applies extra scaling to the gradient, so the learning rate is applied to this transformation of the gradient, not the gradient itself. This can't be turned off because this is the essence of the algorithm. If you'd like to apply the learning rate directly to the gradient, use stochastic gradient descent.

Upvotes: 1
