Reputation: 367
I want to use the AdamOptimizer, but I also want to edit my gradients every step.
The typical usage is as follows:
train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
sess.run(train_step, feed_dict=feed_dict)
This applies a single training step with the AdamOptimizer.
I want to modify the gradients every step, so I extract them and then reinsert them with the following code:
opt = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = opt.compute_gradients(loss)
train_opt = opt.apply_gradients(grads_and_vars)
sess.run(train_opt, feed_dict=feed_dict)
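(For illustration only, the kind of per-step edit I have in mind would look something like the following, with gradient clipping as a hypothetical stand-in for the real operation.)
opt = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = opt.compute_gradients(loss)
# edit the gradients before reapplying them, e.g. clip each one to [-1, 1];
# gradients can be None for variables not connected to the loss, so skip those
clipped_grads_and_vars = [(tf.clip_by_value(g, -1.0, 1.0), v) if g is not None else (g, v)
                          for g, v in grads_and_vars]
train_opt = opt.apply_gradients(clipped_grads_and_vars)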
I would normally apply some operations to grads_and_vars (like the clipping above), but I'm just trying to get this to work first. The previous code fails at sess.run(train_opt, feed_dict=feed_dict) because of the following error:
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value beta1_power_1
[[Node: beta1_power_1/read = Identity[T=DT_FLOAT, _class=["loc:@Variable"], _device="/job:localhost/replica:0/task:0/cpu:0"](beta1_power_1)]]
which is caused by train_opt = opt.apply_gradients(grads_and_vars). Am I not applying the gradients correctly?
There is no error with the GradientDescentOptimizer, so I know this must be the right way to extract the gradients and then reinsert them for a training step.
Is there something I'm missing? How can I use the AdamOptimizer this way?
EDIT: I mentioned that the second code block works with GradientDescentOptimizer, but it is about 10 times slower than the first code block. Is there a way to speed that up?
Upvotes: 0
Views: 922
Reputation: 1318
Run sess.run(tf.local_variables_initializer()). There are local variables in Adam, and you need to initialize them.
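A minimal sketch of how this could fit into the code from the question (assuming loss and feed_dict are defined as there; the important part is that the initializers run only after apply_gradients has been called, since that is the call that creates Adam's extra variables such as beta1_power):
opt = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = opt.compute_gradients(loss)
train_opt = opt.apply_gradients(grads_and_vars)  # Adam's beta1_power/beta2_power and slot variables are created here

sess = tf.Session()
sess.run(tf.global_variables_initializer())  # created after apply_gradients, so it covers Adam's variables too
sess.run(tf.local_variables_initializer())   # initialize any remaining local variables, as suggested above
sess.run(train_opt, feed_dict=feed_dict)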
Upvotes: 1