Kyrol

Reputation: 3607

Does it make sense to use a dynamic learning rate with AdamOptimizer?

I'm developing a convolutional neural network for image recognition, trained on three classes of my own. I built an AlexNet-based model to train. I'd like to know two things:

  1. Does AdamOptimizer perform learning rate decay internally (starting from the fixed value you give it), or not?
  2. If not, can I use tf.train.exponential_decay to perform the decay?

Small examples are appreciated. Thanks.

Upvotes: 9

Views: 4684

Answers (2)

Martin Thoma

Reputation: 136197

Does AdamOptimizer perform learning rate decay internally (starting from the fixed value you give it), or not?

Yes, Adam does perform a learning rate decay.

You should have a look at how Adam works:

D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, Dec. 2014. [Online]. Available: https://arxiv.org/abs/1412.6980

To sum it up: Adam is RMSProp with momentum and bias correction. A very nice explanation is here: http://sebastianruder.com/optimizing-gradient-descent/index.html#adam
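To make that concrete, here is the update rule from the paper (alpha is the step size you pass as the learning rate; the effective per-parameter step is scaled by the bias-corrected moment estimates, which is where the adaptation comes from):

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t            % first moment estimate (momentum)
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2          % second moment estimate (RMSProp-style)
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t}                   % bias correction
\theta_t = \theta_{t-1} - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
```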

Upvotes: 4

bodokaiser

Reputation: 15742

As you can see in adam.py, AdamOptimizer adjusts its learning rate internally.

The learning rate you pass to the constructor just gives the initial value to start with.

So it does not make much sense to use exponential decay with AdamOptimizer, but it does with a gradient descent or momentum optimizer. See here for an example.
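As a rough sketch (TensorFlow 1.x API; the toy loss and the decay numbers are placeholders, not taken from the question's model), this is how tf.train.exponential_decay is typically wired to a gradient descent optimizer, while Adam just takes a fixed initial learning rate:

```python
import tensorflow as tf  # assumes TensorFlow 1.x

# A toy loss so the snippet is self-contained; replace with your model's loss.
w = tf.Variable(5.0)
loss = tf.square(w)

# global_step is incremented by minimize() and drives the decay schedule.
global_step = tf.Variable(0, trainable=False)

# Decay the learning rate by a factor of 0.96 every 1000 steps (illustrative values).
learning_rate = tf.train.exponential_decay(
    learning_rate=0.01,
    global_step=global_step,
    decay_steps=1000,
    decay_rate=0.96,
    staircase=True)

# exponential_decay pairs naturally with plain gradient descent (or momentum):
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(
    loss, global_step=global_step)

# With Adam you would normally just pass a fixed initial learning rate and let it adapt:
adam_train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
```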

Upvotes: 11
