one

Reputation: 2585

Variables still update even if the learning rate is set to 0

I train my model with TensorFlow, and after several iterations the model output becomes NaN. I then set lr=0, expecting the model weights not to update, yet after several more iterations I still get NaN. When I just load the data and print the output, cutting out the optimization step entirely, I do not get NaN.

So I am quite curious why the model still updates when lr=0.

I am using TF 1.3 with Python 2.7.

I have tried both tf.train.GradientDescentOptimizer and tf.train.AdamOptimizer.
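
For reference, here is a minimal sketch (a toy graph, not my real model) of how I check whether any variable actually changes across one training step:

import numpy as np
import tensorflow as tf

# Toy model: one variable, quadratic loss, learning rate 0.
w = tf.get_variable('w', initializer=[1.0, 2.0])
loss = tf.reduce_sum(tf.square(w))
train_op = tf.train.GradientDescentOptimizer(0.0).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Snapshot all variables, run one step, snapshot again.
    before = sess.run(tf.global_variables())
    sess.run(train_op)
    after = sess.run(tf.global_variables())
    for var, b, a in zip(tf.global_variables(), before, after):
        if not np.allclose(b, a, equal_nan=True):
            print('changed:', var.name)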

Upvotes: 0

Views: 273

Answers (2)

one

Reputation: 2585

Patwie is right!

In fact, it was because my code applied tf.log to incorrect ground-truth data, which resulted in a -inf loss.
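
For example, a minimal sketch of the failure mode (not my actual code): tf.log(0.) is -inf and tf.log of a negative value is NaN, so a single bad label can poison the whole loss; clipping the argument away from zero is a common guard:

import tensorflow as tf

labels = tf.constant([0.5, 0.0, -1.0])
# log(0) = -inf and log(negative) = NaN, so the loss blows up.
bad_loss = -tf.log(labels)
# Common guard: clip the argument away from zero before the log.
safe_loss = -tf.log(tf.clip_by_value(labels, 1e-8, 1.0))

with tf.Session() as sess:
    print(sess.run(bad_loss))   # [0.6931472  inf  nan]
    print(sess.run(safe_loss))  # all finite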

Upvotes: 0

Patwie

Reputation: 4460

With a learning rate of 0, your model will not update:

import tensorflow as tf

# A single scalar weight with a simple quadratic cost.
w = tf.get_variable('w', initializer=42.)
cost_op = tf.square(w)
# Learning rate 0.0: every gradient step is scaled by zero, so w never moves.
train_op = tf.train.GradientDescentOptimizer(0.0).minimize(cost_op)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        # One training step; read back the cost and the weight.
        _, cost, value = sess.run([train_op, cost_op, w])
        print(i, cost, value)

gives

(0, 1764.0, 42.0)
(1, 1764.0, 42.0)
(2, 1764.0, 42.0)
(3, 1764.0, 42.0)
(4, 1764.0, 42.0)
(5, 1764.0, 42.0)
(6, 1764.0, 42.0)
(7, 1764.0, 42.0)
(8, 1764.0, 42.0)
(9, 1764.0, 42.0)

for both AdamOptimizer and GradientDescentOptimizer. My best guess is that a non-gradient update (such as BatchNorm's moving statistics) and/or NaN values in your data are producing the NaN, or even a wrong operation somewhere in your graph.
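
To illustrate the BatchNorm point, here is a minimal sketch (assuming tf.layers.batch_normalization is available in your TF version): its moving statistics are updated through the UPDATE_OPS collection, not through gradients, so they can change without any optimizer at all:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
# training=True makes BatchNorm collect batch statistics.
y = tf.layers.batch_normalization(x, training=True)

# The moving-average updates live in UPDATE_OPS, not in any gradient.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
moving_mean = [v for v in tf.global_variables() if 'moving_mean' in v.name][0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    data = np.random.randn(8, 4).astype(np.float32)
    print(sess.run(moving_mean))           # starts at zeros
    sess.run([y] + update_ops, {x: data})  # no optimizer involved
    print(sess.run(moving_mean))           # has already moved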

How do you expect to get help without showing your implementation in a [mcve]?

Upvotes: 1
