SantoshGupta7

Reputation: 6197

Tensorflow: Is the learning rate you set in Adam and Adagrad just the initial learning rate?

I'm reading this blog

https://smist08.wordpress.com/2016/10/04/the-road-to-tensorflow-part-10-more-on-optimization/

where it lists TensorFlow's optimizers and the learning rates passed to them:

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)

optimizer = tf.train.AdadeltaOptimizer(starter_learning_rate).minimize(loss)

optimizer = tf.train.AdagradOptimizer(starter_learning_rate).minimize(loss)     # promising

optimizer = tf.train.AdamOptimizer(starter_learning_rate).minimize(loss)      # promising

optimizer = tf.train.MomentumOptimizer(starter_learning_rate, 0.001).minimize(loss) # diverges

optimizer = tf.train.FtrlOptimizer(starter_learning_rate).minimize(loss)    # promising

optimizer = tf.train.RMSPropOptimizer(starter_learning_rate).minimize(loss)   # promising

It says that the learning rate you input is only the starter learning rate. Does that mean that if you change the learning rate in the middle of training, that change will have no effect because it's not using the starter learning rate anymore?

I tried looking at the API docs, but they don't specify this.

Upvotes: 2

Views: 1520

Answers (1)

Sraw

Reputation: 20206

A short answer:

Except for the first line (plain gradient descent), the rest are adaptive gradient-descent optimizers, meaning they automatically adjust the effective learning rate at every step, typically from statistics they accumulate over the gradients. So the learning rate you pass in is only used as the initial (base) value.
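If you do want to control the base learning rate yourself during training, you can pass a Tensor instead of a constant. Here is a minimal self-contained sketch using the same TF 1.x API as above; the toy loss and the two-phase schedule are purely illustrative:

import tensorflow as tf

# Toy problem: fit a single scalar weight (illustrative only).
w = tf.Variable(5.0)
loss = tf.square(w - 3.0)

lr = tf.placeholder(tf.float32, shape=[])  # base learning rate, fed each step
train_op = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        # You can feed a different base learning rate at any step;
        # Adam still rescales it per parameter internally.
        current_lr = 0.01 if step < 1000 else 0.001
        sess.run(train_op, feed_dict={lr: current_lr})
    print(sess.run(w))  # should end up close to 3.0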

Take AdamOptimizer as an example; you can learn the details in this article.
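For a rough intuition of what "adaptive" means there, here is a plain NumPy sketch of a single Adam update, following the Kingma & Ba formulation rather than TensorFlow's actual implementation; the function and variable names are my own:

import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized averages (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # `lr` is the value you passed to the optimizer; dividing by sqrt(v_hat)
    # rescales the step for every parameter at every iteration.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v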

Upvotes: 3
