Ujjwal

Reputation: 1638

LookUpError in TensorFlow with tf.cond()

Work environment

Problem Description

I use tf.cond() to switch between the training and validation datasets during processing. The following snippet shows how I have done it:

with tf.variable_scope(tf.get_variable_scope()) as vscope:
    for i in range(4):
        with tf.device('/gpu:%d' % i):
            with tf.name_scope('GPU-Tower-%d' % i) as scope:
                worktype = tf.get_variable("wt", [], initializer=tf.zeros_initializer())
                worktype = tf.assign(worktype, 1)
                workcondition = tf.equal(worktype, 1)
                elem = tf.cond(workcondition,
                               lambda: train_iterator.get_next(),
                               lambda: val_iterator.get_next())
                net = vgg16cnn2(elem[0], numclasses=256)
                img = elem[0]
                centropy = tf.reduce_mean(
                    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=elem[1], logits=net))
                reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES, scope)
                regloss = 0.05 * tf.reduce_sum(reg_losses)
                total_loss = centropy + regloss
                t1 = tf.summary.scalar("Training Batch Loss", total_loss)
                tf.get_variable_scope().reuse_variables()
                predictions = tf.cast(tf.argmax(tf.nn.softmax(net), 1), tf.int32)
                correct_predictions = tf.cast(tf.equal(predictions, elem[1]), tf.float32)
                batch_accuracy = tf.reduce_mean(correct_predictions)
                t2 = tf.summary.scalar("Training Batch Accuracy", batch_accuracy)
                correct_detection.append(correct_predictions)
                grads = optim.compute_gradients(total_loss)

So, based on the value of worktype, a minibatch will be taken from either the training or the validation set.

When I run this code, I get the following LookupError:

LookupError: No gradient defined for operation 'GPU-Tower-0/cond/IteratorGetNext_1' (op type: IteratorGetNext)

Why does TensorFlow think that IteratorGetNext_1 requires a gradient? How can I remedy this?

Upvotes: 0

Views: 312

Answers (1)

chrert

Reputation: 462

The variable worktype is marked as trainable. By default, Optimizer.compute_gradients(...) computes the gradients for all trainable variables.

There are two ways you could solve this:

  1. Set trainable=False in tf.get_variable(...).
  2. Explicitly specify the variables for which the gradients should be computed with the var_list argument of Optimizer.compute_gradients(...).
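Both options can be demonstrated in a small, self-contained sketch. This uses the TF1-style API (via tf.compat.v1 so it also runs on TF2 installs); the variables `worktype` and `w` stand in for the question's switch variable and model weights, and `loss` is an illustrative placeholder for `total_loss`:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph mode, as in the question

# Option 1: mark the switch variable as non-trainable so the
# optimizer never tries to compute a gradient through it.
worktype = tf.get_variable("wt", [], initializer=tf.zeros_initializer(),
                           trainable=False)

# A stand-in for the actual model weights and loss.
w = tf.get_variable("w", [], initializer=tf.ones_initializer())
loss = w * 2.0

optim = tf.train.GradientDescentOptimizer(0.1)

# compute_gradients defaults to all *trainable* variables,
# so 'wt' is skipped and no gradient is requested for it.
grads = optim.compute_gradients(loss)

# Option 2: pass var_list explicitly to restrict gradient
# computation to the variables you actually want to train.
grads2 = optim.compute_gradients(loss, var_list=[w])
```

With either option, `IteratorGetNext` never ends up on a gradient path through the switch variable, so the LookupError disappears.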

Upvotes: 1
