buydadip

Reputation: 9437

TensorFlow variable configuration

I successfully implemented a feed-forward algorithm in TensorFlow that looked as follows...

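import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# FLAGS (learning_rate, training_epochs, batch_size) is defined elsewhere in the script.
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)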

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784])  # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10])  # 0-9 digits recognition => 10 classes

# set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits)  # Softmax

# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)

# initializing the variables
init = tf.global_variables_initializer()

...and the training cycle was as follows...

# launch the graph
with tf.Session() as sess:

    sess.run(init)

    # training cycle
    for epoch in range(FLAGS.training_epochs):
        avg_cost = 0
        total_batch = int(mnist.train.num_examples/FLAGS.batch_size)
        # loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)

            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})

...the rest of the code is not relevant here. Up to this point the code works perfectly. Note that my batch_size is 100. The problem is that I am using tf.placeholder for my inputs, but I actually need to switch them to tf.get_variable. The first thing I did was change the following...

# tf Graph Input
x = tf.get_variable("input_image", shape=[100,784], dtype=tf.float32)
y = tf.placeholder(shape=[100,10], name='input_label', dtype=tf.float32)  # 0-9 digits recognition => 10 classes


# set model weights
W = tf.get_variable("weights", shape=[784, 10], dtype=tf.float32, initializer=tf.random_normal_initializer())
b = tf.get_variable("biases", shape=[1, 10], dtype=tf.float32, initializer=tf.zeros_initializer())

# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits)  # Softmax

# minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate).minimize(cost)

# initializing the variables
init = tf.global_variables_initializer()

...so far so good. But when I run the exact same training cycle as above, with batch_size = 100, I get the following error...

tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node GradientDescent/update_input_image/ApplyGradientDescent was passed float from _recv_input_image_0:0 incompatible with expected float_ref.

How can I fix this issue? The error is coming from the following line...

_, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})

Upvotes: 0

Views: 144

Answers (1)

mrry

Reputation: 126184

It's unclear to me why you needed to change x to a tf.Variable when you are continuing to feed a value for it. There are two workarounds (not counting the case where you could just revert x to being tf.placeholder() as in the working code):

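  1. The error is raised because the optimizer tries to apply an SGD update to x, which is a trainable variable by default, while you are feeding a value for it. The update op expects a mutable variable reference (float_ref), but the fed value arrives as a plain float tensor, hence the confusing runtime type error. You could prevent the optimizer from doing this by passing trainable=False when constructing x: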

    x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32,
                        trainable=False)
    
  2. Since x is a variable, you could assign the batch of images to it in a separate step before running the optimizer:

    x = tf.get_variable("input_image", shape=[100, 784], dtype=tf.float32)
    x_placeholder = tf.placeholder(tf.float32, shape=[100, 784])
    assign_x_op = x.assign(x_placeholder).op
    
    # ...
    
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(FLAGS.batch_size)
    
        # Assign the contents of `batch_xs` to variable `x`.
        sess.run(assign_x_op, feed_dict={x_placeholder: batch_xs})
    
        # N.B. Now you do not need to feed `x`.
        _, c = sess.run([optimizer, cost], feed_dict={y: batch_ys})
    

    This latter version would allow you to perform gradient descent on the contents of the image (which might be why you'd want to store it in a variable in the first place).
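To make the difference between the two workarounds concrete, here is a minimal pure-Python sketch of the same softmax model on a tiny made-up example. It is not TensorFlow code: the helper names (`softmax`, `loss_and_grads`, `sgd_step`, `run`), the `train_x` flag, and all the numbers are hypothetical and chosen only for illustration. With `train_x=False` the input is left alone, which is roughly what `trainable=False` achieves in workaround 1; with `train_x=True` the input itself receives gradient updates, which is what workaround 2 makes possible.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_grads(x, W, b, y):
    """Cross-entropy of softmax(x @ W + b) against one-hot y, plus gradients."""
    n_in, n_out = len(x), len(b)
    logits = [sum(x[i] * W[i][j] for i in range(n_in)) + b[j] for j in range(n_out)]
    p = softmax(logits)
    loss = -sum(y[j] * math.log(p[j]) for j in range(n_out))
    delta = [p[j] - y[j] for j in range(n_out)]  # d(loss)/d(logits) for one-hot y
    grad_x = [sum(W[i][j] * delta[j] for j in range(n_out)) for i in range(n_in)]
    grad_W = [[x[i] * delta[j] for j in range(n_out)] for i in range(n_in)]
    return loss, grad_x, grad_W, delta  # delta is also d(loss)/d(b)


def sgd_step(x, W, b, y, lr, train_x):
    """One gradient-descent step; x is only updated when train_x is True."""
    loss, gx, gW, gb = loss_and_grads(x, W, b, y)
    W = [[w - lr * g for w, g in zip(w_row, g_row)] for w_row, g_row in zip(W, gW)]
    b = [v - lr * g for v, g in zip(b, gb)]
    if train_x:
        x = [v - lr * g for v, g in zip(x, gx)]
    return x, W, b, loss


# Hypothetical toy data: 3 input features, 2 classes, one example.
x0 = [0.5, -1.0, 2.0]
W0 = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.2]]
b0 = [0.0, 0.0]
y0 = [1.0, 0.0]


def run(train_x, steps=5, lr=0.1):
    """Run a few SGD steps from the same starting point; return final x and losses."""
    x, W, b = list(x0), [row[:] for row in W0], list(b0)
    losses = []
    for _ in range(steps):
        x, W, b, loss = sgd_step(x, W, b, y0, lr, train_x)
        losses.append(loss)
    return x, losses


x_frozen, losses_frozen = run(train_x=False)   # input behaves like fed data
x_trained, losses_trained = run(train_x=True)  # input is optimized as well

print("frozen input: ", x_frozen)
print("trained input:", x_trained)
```

In the frozen run only the weights and biases move, so the input stays exactly what was supplied. In the trained run the input drifts toward values that lower the loss, which is the behaviour you would only want if optimizing the image is actually the goal.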

Upvotes: 1
