Reputation: 3607
I'm developing a convolutional neural network for image recognition based on three classes of my own. I built an AlexNet-based model to train. I'd like to know two things:

1. Does AdamOptimizer perform a learning rate decay internally (from a fixed given value) or not?
2. Does it make sense to use tf.train.exponential_decay to perform the decay?

Small examples are appreciated. Thanks
Upvotes: 9
Views: 4684
Reputation: 136197
Does AdamOptimizer perform a learning rate decay internally (from a fixed given value) or not?
Yes, Adam does perform a learning rate decay.
You should have a look at how Adam works:
D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, Dec. 2014. [Online]. Available: https://arxiv.org/abs/1412.6980
To sum it up: Adam is RMSProp with momentum and bias correction. A very nice explanation is here: http://sebastianruder.com/optimizing-gradient-descent/index.html#adam
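Below is a minimal NumPy sketch of the Adam update rule from the paper, just to show why the effective step size adapts on its own without an external schedule. The toy quadratic loss and the hyperparameter values in the usage part are illustrative, not taken from the question.

    import numpy as np

    def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        m = beta1 * m + (1 - beta1) * grad          # 1st-moment estimate (momentum part)
        v = beta2 * v + (1 - beta2) * grad ** 2     # 2nd-moment estimate (RMSProp part)
        m_hat = m / (1 - beta1 ** t)                # bias correction
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
        return theta, m, v

    # toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta
    theta, m, v = 5.0, 0.0, 0.0
    for t in range(1, 2001):
        theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.01)
    print(theta)  # ends up close to 0; the step size was adapted internally, no decay schedule needed

The division by sqrt(v_hat) is what scales each parameter's step down as training progresses, which is the "decay-like" behaviour the question asks about.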
Upvotes: 4
Reputation: 15742
As you can see in adam.py, AdamOptimizer will adjust its learning rate internally.
The learning rate you pass to the constructor just gives the initial value to start with.
So yes, and it therefore does not make much sense to apply exponential decay to AdamOptimizer; it is meant for a gradient descent or momentum optimizer. See here for an example.
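Since the linked example is not reproduced here, the following is a small sketch (TF 1.x API; the toy loss and the decay numbers are made up for illustration) of how tf.train.exponential_decay is typically wired to a gradient descent optimizer through a global_step counter, and how Adam would be used instead.

    import tensorflow as tf

    # toy variable/loss just so the graph builds; replace with your real model's loss
    w = tf.Variable(5.0)
    loss = tf.square(w)

    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
        learning_rate=0.1,       # starting rate
        global_step=global_step,
        decay_steps=1000,        # every 1000 steps ...
        decay_rate=0.96,         # ... multiply the rate by 0.96
        staircase=True)

    # exponential decay pairs naturally with plain gradient descent;
    # passing global_step makes minimize() increment the counter each step
    train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss, global_step=global_step)

    # with Adam you would normally just pass the initial rate and let it adapt:
    # train_op = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)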
Upvotes: 11