When training a single batch, is iteration of examples necessary (optimal) in python code?

Question

Say I have one batch that I want to train my model on. Do I simply run tf.Session()'s sess.run(batch) once, or do I have to iterate through all of the batch's examples with a loop in the session? I'm looking for the optimal way to iterate/update the training ops, such as loss. I thought tensorflow would handle it itself, especially in the cases where tf.nn.dynamic_rnn() takes in a batch dimension for listing the examples. I thought, perhaps naively, that a for loop in the python code would be the inefficient method of updating the loss. I am using tf.losses.mean_squared_error(batch) for a regression problem.

My regression problem is given two lists of word vectors (300d each), and determines the similarity between the two lists on a continuous scale from [0, 5]. My supervised model is Deepmind's Differential Neural Computer (DNC). The problem is I do not believe it is learning anything. this is due to the fact that the all of the output from the model is centered around 0 and even negative. I do not know how it could possibly be negative given no negative labels provided. I only call sess.run(loss) for the single batch, I do not create a python loop to iterate through it.

So, what is the most efficient way to iterate the training of a model and how do people go about it? Do they really use python loops to do multiple calls to sess.run(loss) (this was done in the training file example for DNC, and I have seen it in other examples as well). I am certain I get the final loss from the below process, but I am uncertain if the model has actually been trained entirely just because the loss was processed in one go. I also do not understand the point of update_ops returned by some functions, and am uncertain if they are necessary to ensure the model has been trained.

Example of what I mean by processing a batch's loss once:

# assume the model has been defined prior through batch_output_logits
train_loss = tf.losses.mean_squared_error(labels=target,
                                    predictions=batch_output_logits)

with tf.Session() as sess:
    sess.run(init_op) # pseudo code, unnecessary for question
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # is this the entire batch's loss && model  has been trained for that batch?
    loss_np = sess.run(train_step, train_loss)

    coord.request_stop()
    coord.join(threads)

Any input on why I am receiving negative values when the labels are in the range [0, 5] is welcomed as well(general abstract answers for this are fine, because its not the main focus). I am thinking of attempting to create a piece-wise function, if possible, for my loss, so that for any values out of bounds face a rapidly growing exponential loss function. Uncertain how to implement, or if it would even work.

Code is currently private. Once allowed, I will make the repo public.

To run DNC model, go to the project/ directory and run python -m src.main. If there are errors you encounter feel free to let me know.

This model depends upon Tensorflow r1.2, most recent Sonnet, and NLTK's punkt for Tokenizing sentences in sts_handler.py and tests/*.

When training a single batch, is iteration of examples necessary (optimal) in python code?

Answers (1)

Related Questions