Reputation: 2923
Meaning to say, if I initially have the following training operations in my graph:
with tf.Graph().as_default() as g:
    images, labels = load_batch(...)

    with slim.arg_scope(...):
        logits, end_points = inceptionResnetV2(images, num_classes=..., is_training=True)

    loss = slim.losses.softmax_cross_entropy(logits, labels)
    optimizer = tf.train.AdamOptimizer(learning_rate=0.002)
    train_op = slim.learning.create_train_op(loss, optimizer)

    sv = tf.train.Supervisor(...)
    with sv.managed_session() as sess:
        # perform your regular training loop here with sess.run(train_op)
This lets me train my model just fine. However, I would also like to evaluate a small validation dataset every once in a while inside the same session. Would it take too much memory to build a nearly exact replica within the same graph, like:
images_val, labels_val = load_batch(...)

with slim.arg_scope(...):
    logits_val, end_points_val = inceptionResnetV2(images_val, num_classes=..., is_training=False)

# take the argmax over the class probabilities so predictions match the label shape
predictions = tf.argmax(end_points_val['Predictions'], 1)
acc, acc_updates = tf.contrib.metrics.streaming_accuracy(predictions, labels_val)
# following this, we can run acc_updates in a session to update the accuracy, which we can then print to monitor
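For concreteness, the monitoring loop I have in mind is roughly the following (a minimal sketch assuming the ops above; num_steps, the eval interval, and num_val_batches are placeholders, not real values):

with sv.managed_session() as sess:
    for step in range(num_steps):
        sess.run(train_op)
        if step % 500 == 0:  # every once in a while, run a validation pass
            for _ in range(num_val_batches):
                sess.run(acc_updates)  # accumulate the streaming counters
            print('validation accuracy: %f' % sess.run(acc))
            # note: streaming_accuracy keeps accumulating across calls
            # unless its local (metric) variables are re-initialized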
My concern is that to evaluate my validation dataset, I need to set the is_training argument to False so that dropout is disabled. But will creating an entire inception-resnet-v2 model from scratch, just for validation, inside the same graph consume too much memory? Or should I just create an entirely new file that runs the validation on its own?
Ideally, I want three datasets: a training set, a small validation set to evaluate during training, and a final evaluation set. The small validation set will help me see whether my model is overfitting the training data. If my proposed approach consumes too much memory, however, would it be equivalent to just occasionally monitoring the score on the training data? Or is there a better way to test the validation set while training?
Upvotes: 2
Views: 176
Reputation: 2109
TensorFlow's developers thought about this and made variables ready to be shared: you can see the documentation on sharing variables here. Using scopes the right way makes it possible to reuse variables. One very good example (the context is a language model, but never mind) is the TensorFlow PTB Word LM.
The overall pseudo-code of this approach looks something like this:
class Model:
    def __init__(self, params, train=True):
        """Build the model."""
        tf.placeholder(...)
        tf.get_variable(...)

def main(_):
    with tf.Graph().as_default() as g:
        with tf.name_scope("Train"):
            with tf.variable_scope("Model", reuse=None):
                train = Model(params, train=True)

        with tf.name_scope("Valid"):
            # Now reuse variables = no memory cost
            with tf.variable_scope("Model", reuse=True):
                # But you can set different parameters
                valid = Model(params, train=False)

        session = tf.Session()
        ...
Thus you can share variables without building the exact same model twice, since the parameters may change the model itself.
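Applied to the question's setup, a hedged sketch of the same pattern might look like this. I'm assuming the slim model definition from the tensorflow/models repo, where inception_resnet_v2 takes a reuse keyword; load_batch and num_classes are placeholders from the question:

import tensorflow as tf
import tensorflow.contrib.slim as slim
# assumed import path for the model definition (tensorflow/models, slim/nets)
from nets.inception_resnet_v2 import inception_resnet_v2, inception_resnet_v2_arg_scope

images, labels = load_batch(...)          # training split
images_val, labels_val = load_batch(...)  # validation split

with slim.arg_scope(inception_resnet_v2_arg_scope()):
    # First call creates the variables.
    logits, end_points = inception_resnet_v2(
        images, num_classes=num_classes, is_training=True)
    # Second call reuses them: no second copy of the weights in memory,
    # only the extra activations for the (small) validation batch.
    logits_val, end_points_val = inception_resnet_v2(
        images_val, num_classes=num_classes, is_training=False, reuse=True)

Here is_training=False disables dropout (and switches batch norm to inference mode) on the validation tower while it still reads the shared weights.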
Hope this helps
pltrdy
Upvotes: 1