Reputation: 24616
I'm trying to create a minimal code snippet to understand the GradientDescentOptimizer class, to help me understand the TensorFlow API docs in more depth. I would like to provide some hardcoded inputs to the GradientDescentOptimizer, run the minimize() method, and inspect the output. So far, I have created the following:
loss_data = tf.Variable([2.0], dtype=tf.float32)
train_data = tf.Variable([20.0], dtype=tf.float32, name='train')
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.02)
gradients = optimizer.compute_gradients(loss_data, var_list=[train_data])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(gradients))
The error I get is:
TypeError: Fetch argument None has invalid type <class 'NoneType'>
However, I'm just guessing what the inputs could look like. Any pointers to help me understand what I should be passing to this function would be appreciated.
Some more context ...
I followed a similar process to understand activation functions by isolating them and treating them as a black-box where I send a range of inputs and inspect the corresponding outputs.
# I could have used a list of values but I wanted to experiment
# by passing in one parameter value at a time.
placeholder = tf.placeholder(dtype=tf.float32, shape=[1], name='placeholder')
activated = tf.nn.sigmoid(placeholder)
with tf.Session() as sess:
    x_y = {}
    for x in range(-10, 10):
        x_y[x] = sess.run(activated, feed_dict={placeholder: [x / 1.0]})
import matplotlib.pyplot as plt
%matplotlib inline
x, y = zip(*x_y.items())
plt.plot(x, y)
The above process was really useful for my understanding of activation functions and I was hoping to do something similar for optimizers.
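For reference, that same black-box check works without a session at all, since tf.nn.sigmoid just computes 1 / (1 + e^-x) elementwise. A minimal plain-Python sketch (the sigmoid helper name is mine) reproduces the values:

```python
import math

def sigmoid(x):
    # the same function tf.nn.sigmoid applies elementwise
    return 1.0 / (1.0 + math.exp(-x))

for x in range(-10, 10):
    print(x, sigmoid(x))  # sigmoid(0) == 0.5
```
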
Upvotes: 1
Views: 183
Reputation: 12938
See this discussion on GitHub. Basically, optimizer.compute_gradients returns a gradient of None for all variables not related to its input, and session.run throws this error if it gets a None value. The simple workaround: only tell it about variables that are relevant to the loss function you pass it:
gradients = optimizer.compute_gradients(loss_data, var_list=[loss_data])
Now when you run:
print(sess.run(gradients))
# [(gradient, input)]
# [(array([1.], dtype=float32), array([2.], dtype=float32))]
Which makes sense: you are calculating the gradient of x with respect to x, which is always 1.
For something more illustrative, let's define a loss op that changes with an input variable:
x = tf.Variable([1], dtype=tf.float32, name='input_x')
loss = tf.abs(x)
Your optimizer is defined the same, but your gradients op is now computed on this loss op:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.02)
gradients = optimizer.compute_gradients(loss, [x])
Finally, we can run this in a loop with different inputs and see how the gradient changes:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(-3, 3):
        grad = sess.run(gradients, feed_dict={x: (i,)})
        print(grad)
# [(gradient, input)]
# [(array([-1.], dtype=float32), array([-3.], dtype=float32))]
# [(array([-1.], dtype=float32), array([-2.], dtype=float32))]
# [(array([-1.], dtype=float32), array([-1.], dtype=float32))]
# [(array([0.], dtype=float32), array([0.], dtype=float32))]
# [(array([1.], dtype=float32), array([1.], dtype=float32))]
# [(array([1.], dtype=float32), array([2.], dtype=float32))]
This follows, as loss = -x if x is negative and loss = +x if x is positive, so d_loss/d_x will be -1 for a negative number, +1 for a positive number, and zero if the input is zero.
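You can sanity-check those signs without TensorFlow at all, using a central finite difference. This is a plain-Python sketch (the loss and grad helper names are mine, not TensorFlow API):

```python
def loss(x):
    # same function as tf.abs(x) on a scalar
    return abs(x)

def grad(x, eps=1e-6):
    # central-difference approximation of d(loss)/dx
    return (loss(x + eps) - loss(x - eps)) / (2 * eps)

for x in [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0]:
    print(x, grad(x))  # -1 for negatives, 0 at zero, +1 for positives
```
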
Upvotes: 1
Reputation: 24651
Your loss shouldn't be a variable (it is not a parameter of your model) but the result of an operation, e.g.
loss_data = train_data**2
Currently your loss does not depend on train_data, which explains why no gradient can be computed.
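With that change, compute_gradients would report d(train_data**2)/d(train_data) = 2 * train_data, i.e. 40.0 for the initial value of 20.0. A plain-Python finite-difference sketch (the loss and grad helper names are mine) confirms the arithmetic without needing a TensorFlow session:

```python
def loss(w):
    # the suggested loss: train_data squared
    return w ** 2

def grad(w, eps=1e-6):
    # central-difference approximation of d(loss)/dw
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

print(grad(20.0))  # ≈ 40.0, matching 2 * train_data
```
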
Upvotes: 2