Reputation: 66
From various TensorFlow examples (translation, ptb) it seems that you need to explicitly change the learning rate when using GradientDescentOptimizer. But is that also the case when using more 'sophisticated' techniques like Adagrad, Adadelta, etc.? Also, when we continue training the model from a saved instance, are the past values used by these optimizers saved in the model file?
Upvotes: 0
Views: 230
Reputation: 1889
It depends on the optimizer you are using. Vanilla SGD needs (accepts) manual adaptation of the learning rate, and so do some other optimizers. Adadelta, for example, does not, because it adapts the effective per-parameter step size on its own (https://arxiv.org/abs/1212.5701).
So this depends not so much on TensorFlow but rather on the mathematical background of the optimizer you are using.
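For illustration, here is a minimal TF 1.x-style sketch (the variable, loss, and hyperparameters are made up) contrasting an explicit schedule for GradientDescentOptimizer with Adagrad, which only takes an initial rate and adapts per-parameter steps internally:

    import tensorflow as tf

    # Toy variable and loss, just to have something to minimize.
    w = tf.Variable(5.0, name="w")
    loss = tf.square(w - 3.0)

    global_step = tf.Variable(0, trainable=False, name="global_step")

    # Vanilla SGD: the learning rate has to be scheduled explicitly,
    # e.g. with exponential decay driven by global_step.
    sgd_lr = tf.train.exponential_decay(
        learning_rate=0.1, global_step=global_step,
        decay_steps=1000, decay_rate=0.96, staircase=True)
    sgd_train_op = tf.train.GradientDescentOptimizer(sgd_lr).minimize(
        loss, global_step=global_step)

    # Adagrad (similarly Adadelta): you pass a single initial rate once;
    # the optimizer adapts the effective step size per parameter itself.
    adagrad_train_op = tf.train.AdagradOptimizer(0.1).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(5):
            sess.run(sgd_train_op)   # learning rate shrinks as global_step grows
        print(sess.run(sgd_lr))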
Furthermore: yes, saving and restarting the training does not reset the learning rates or the optimizers' accumulated state; training continues from the point where it was saved.
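As a rough sketch of that second point (the checkpoint path is a placeholder): a tf.train.Saver created after the optimizer will, by default, also include the optimizer's slot variables (e.g. the Adagrad accumulators) in the checkpoint, so restoring resumes the adapted state rather than resetting it:

    import tensorflow as tf

    w = tf.Variable(5.0, name="w")
    loss = tf.square(w - 3.0)
    train_op = tf.train.AdagradOptimizer(0.1).minimize(loss)

    # Created after the optimizer, the Saver picks up the accumulator
    # slot variables in addition to the model weights.
    saver = tf.train.Saver()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(train_op)
        saver.save(sess, "/tmp/model.ckpt")

    # Later: restore and continue training where it left off.
    with tf.Session() as sess:
        saver.restore(sess, "/tmp/model.ckpt")
        sess.run(train_op)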
Upvotes: 1