Yingchao Xiong

Reputation: 255

Optimise TensorFlow learning rate

How could I find the best learning rate and decay rate dynamically?

Functions like tf.train.exponential_decay cannot adapt dynamically to different cases, since the starting rate and decay rate are pre-defined.

Upvotes: 1

Views: 643

Answers (1)

Yaroslav Bulatov

Reputation: 57983

This is an open research problem, but with large batches a backtracking line search can be useful.

Note that your loss function is approximately linear in a small enough neighborhood, so if you take small enough steps, you can predict what your loss decrease will be.

So the idea is to compare the predicted decrease in loss with the actual decrease. If they are close, you were too conservative and you increase your step size. If they are far apart, do the opposite.
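Concretely (my notation, not from the original answer): under that first-order approximation, a plain gradient step of size \eta should decrease the loss by about

    \Delta L_{\text{pred}} \approx \eta \, \lVert \nabla L(\theta) \rVert^2

and that predicted drop is what gets compared against the decrease you actually observe after the step.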

There's no built-in primitive to do this in TensorFlow, but you can implement it using lower-level ops. Here's an end-to-end example on an MNIST autoencoder: https://github.com/yaroslavvb/stuff/tree/master/line_search_example
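Below is a minimal, self-contained sketch of the idea in TF1-style graph code (matching the tf.train.* era of the question), on a toy quadratic loss rather than the linked MNIST autoencoder. The grow/shrink factors (1.1, 0.5) and the 0.9 threshold are arbitrary illustrative choices, not values from the linked example.

    import tensorflow as tf

    # Hypothetical toy problem: a quadratic loss, not the linked MNIST autoencoder.
    x = tf.Variable([3.0, -2.0])
    loss = tf.reduce_sum(tf.square(x))
    grad = tf.gradients(loss, [x])[0]

    lr = tf.Variable(0.001, trainable=False)
    step_op = tf.assign_sub(x, lr * grad)              # plain SGD step
    predicted = lr * tf.reduce_sum(tf.square(grad))    # first-order prediction of the loss drop
    grow_lr = tf.assign(lr, lr * 1.1)                  # arbitrary grow/shrink factors
    shrink_lr = tf.assign(lr, lr * 0.5)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        prev_loss = sess.run(loss)
        for _ in range(200):
            pred = sess.run(predicted)                 # predict before stepping
            sess.run(step_op)
            new_loss = sess.run(loss)
            actual = prev_loss - new_loss
            # Actual drop close to prediction -> step was conservative, grow the rate;
            # much worse than predicted (or loss went up) -> shrink it.
            sess.run(grow_lr if actual >= 0.9 * pred else shrink_lr)
            prev_loss = new_loss
        print(sess.run([lr, loss]))

Running gradient and loss evaluation as separate sess.run calls keeps the prediction from racing against the variable update; the linked example does the same bookkeeping with lower-level ops inside the graph.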

The learning rate quickly goes up to 0.05, and then once training has converged and can't make further progress, it drops to zero.

[Plot: learning rate over training steps]

Upvotes: 3
