Nils Cao

Reputation: 1419

Is it possible to calculate two kinds of gradients separately in TensorFlow?

w_1   = tf.get_variable("w_1", shape)   
w_2   = tf.get_variable("w_2", shape) 
output = tf.mul(w_1, w_2)
.....
.....
optimizer = tf.train.AdamOptimizer(alpha).minimize(self.cost)

As we know, when we run optimizer, TensorFlow will calculate the gradients and update both w_1 and w_2.

But what I want to do is: first, treat w_1 as a constant and calculate gradients to update only w_2; second, treat w_2 as a constant and calculate gradients to update only w_1. I want to alternate between these two steps.

Actually, I have seen this suggested before (link), but I use the BasicLSTMCell module. When I run print(tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)), it shows there are four parameters in my neural network, which means that besides w_1 and w_2 there are two other parameters inside BasicLSTMCell. So if I use something like var_list=[w_1], those two BasicLSTMCell parameters will not be optimized. How can I do this?

Upvotes: 1

Views: 1195

Answers (2)

mrry

Reputation: 126154

It's tricky, but possible, to do what you want. The trick is to define a var_list by excluding w_1 (or w_2) from the list when you want to hold it constant. For example, you could use a list comprehension to match variables based on their (unique) names, as follows:

w_1 = tf.get_variable("w_1", shape)   
w_2 = tf.get_variable("w_2", shape) 
output = tf.mul(w_1, w_2)

variables_without_w_1 = [v for v in tf.trainable_variables() if v.name != w_1.name]
variables_without_w_2 = [v for v in tf.trainable_variables() if v.name != w_2.name]

optimizer_without_w_1 = tf.train.AdamOptimizer(alpha).minimize(
    self.cost, var_list=variables_without_w_1)
optimizer_without_w_2 = tf.train.AdamOptimizer(alpha).minimize(
    self.cost, var_list=variables_without_w_2)
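
As a rough sketch, you could then alternate the two training ops in your session loop (the feed_dict and num_steps below are just placeholders for your own setup):

sess = tf.Session()
sess.run(tf.initialize_all_variables())

for step in range(num_steps):
    if step % 2 == 0:
        # Even steps: w_1 is held constant; w_2 and the LSTM variables update.
        sess.run(optimizer_without_w_1, feed_dict=feed_dict)
    else:
        # Odd steps: w_2 is held constant; w_1 and the LSTM variables update.
        sess.run(optimizer_without_w_2, feed_dict=feed_dict)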

Upvotes: 1

Vincent Vanhoucke

Reputation: 666

It is possible, even likely, that BasicLSTMCell gives you access to its internal collection of Variables in some way, which you could pass to var_list. But a more general way could also be to get the gradients from the optimizer interface directly:

optimizer = tf.train.AdamOptimizer(alpha)

grads_and_vars = optimizer.compute_gradients(self.cost)

grads_and_vars is a list of tuples (gradient, variable). You can filter out the ones you want to keep fixed, and then apply the rest:

optimizer.apply_gradients(filtered_grads_and_vars)
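
For instance, to hold w_1 fixed on a given step, the filtering might look roughly like this (a sketch that matches pairs by variable name, assuming w_1 is in scope):

# Keep every (gradient, variable) pair except the one for w_1,
# so w_1 stays fixed while all other variables are updated.
filtered_grads_and_vars = [(g, v) for g, v in grads_and_vars
                           if v.name != w_1.name]

You would build a second filtered list that excludes w_2 in the same way, and alternate which apply_gradients op you run.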

Upvotes: 1
