Russell Maytham

Reputation: 137

Tensorflow error when training: Caused by op 'shuffle_batch'

I am trying to read images and labels from a TFRecord file and then train with them. I know that my TFRecord file exists, and I have checked that it contains 1000 images and labels. The problem only seems to arise when I want to pipe them as input for training. I am new to Python and TensorFlow and am not sure how to fix this.

I get the following error, occurring at tf.train.shuffle_batch:

...

Caused by op 'shuffle_batch', defined at:
  File "C:/AI/projects/DataGen/train.py", line 40, in <module>
    images_batch, labels_batch = tf.train.shuffle_batch([image, label], batch_size=10, capacity=1000, min_after_dequeue=2)

...

Here is my code, cobbled together from various MNIST examples:

import tensorflow as tf


def read_and_decode_single_example(filename):
    # first construct a queue containing a list of filenames.
    # this lets a user split up their dataset into multiple files to keep
    # size down
    filename_queue = tf.train.string_input_producer([filename],
                                                    num_epochs=None)
    # Unlike the TFRecordWriter, the TFRecordReader is symbolic
    reader = tf.TFRecordReader()
    # One can read a single serialized example from a filename
    # serialized_example is a Tensor of type string.
    _, serialized_example = reader.read(filename_queue)
    # The serialized example is converted back to actual values.
    # One needs to describe the format of the objects to be returned

    feature = {'image': tf.FixedLenFeature([], tf.string),
               'label': tf.FixedLenFeature([], tf.int64)}

    features = tf.parse_single_example(serialized_example, features=feature)

    # now return the converted data
    label = tf.cast(features['label'], tf.float32)
    image = tf.decode_raw(features['image'], tf.float32)
    image = tf.reshape(image, [28, 28, 3])
    return label, image


with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())

    # get single examples
    label, image = read_and_decode_single_example("train.tfrecords")

    image = tf.cast(image, tf.float32) / 255.

    # groups examples into batches randomly
    images_batch, labels_batch = tf.train.shuffle_batch([image, label], batch_size=10, capacity=1000, min_after_dequeue=2)

    # The model is:
    #
    # Y = softmax( X * W + b)
    #              X: matrix of RGB images of 28x28 pixels, flattened (there are 10 images in a mini-batch)
    #              W: weight matrix with (28x28x3) rows and 10 columns
    #              b: bias vector with 10 dimensions
    #              +: add with broadcasting: adds the vector to each row of the matrix (numpy)
    #              softmax(matrix) applies softmax on each row
    #              softmax(row) applies an exp to each value then divides by the norm of the resulting row
    #              Y: output matrix with 10 rows and 10 columns

    # input X: 28x28x3 RGB images
    X = images_batch
    # correct answers will go here
    Y_ = labels_batch
    # weights W[28 * 28 * 3, 10]
    W = tf.Variable(tf.zeros([28 * 28 * 3, 10]))
    # biases b[10]
    b = tf.Variable(tf.zeros([10]))

    # flatten the images into a single line of pixels
    # -1 in the shape definition means "the only possible dimension that will preserve the number of elements"
    XX = tf.reshape(X, [-1, 28 * 28 * 3])

    # The model
    Y = tf.nn.softmax(tf.matmul(XX, W) + b)

    # loss function: cross-entropy = - sum( Y_i * log(Yi) )
    #                           Y: the computed output vector
    #                           Y_: the desired output vector

    # cross-entropy
    # log takes the log of each element, * multiplies the tensors element by element
    # reduce_mean averages over all the components in the tensor, so *100.0
    # undoes the division by the 100 elements (batch of 10 x 10 classes)
    # and gives back the total cross-entropy for the batch
    cross_entropy = -tf.reduce_mean(Y_ * tf.log(Y)) * 100.0

    # accuracy of the trained model, between 0 (worst) and 1 (best)
    correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # training, learning rate = 0.005
    train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    for i in range(100 + 1):
        print(i)
        sess.run(train_step)

    coord.request_stop()

    # Wait for threads to stop
    coord.join(threads)
    sess.close()

Upvotes: 0

Views: 684

Answers (1)

Russell Maytham

Reputation: 137

I moved the initialization to just before the tf.train.start_queue_runners call, i.e. after the model is set up, and that solved the problem:

# initialize variables only after the whole graph has been built
sess.run(tf.local_variables_initializer())
sess.run(tf.global_variables_initializer())
# start the queue-runner threads only once everything is initialized
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
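
For completeness, here is a minimal sketch of the corrected ordering (a sketch, not the exact original script): build the complete graph first, then run the initializers, then start the queue runners. It assumes the read_and_decode_single_example helper and the train.tfrecords file from the question; the tf.one_hot conversion is an addition, since the question's loss implicitly treats the labels as one-hot vectors.

import tensorflow as tf

# build the entire graph first: input pipeline, model and training op
label, image = read_and_decode_single_example("train.tfrecords")
image = tf.cast(image, tf.float32) / 255.
images_batch, labels_batch = tf.train.shuffle_batch(
    [image, label], batch_size=10, capacity=1000, min_after_dequeue=2)

W = tf.Variable(tf.zeros([28 * 28 * 3, 10]))
b = tf.Variable(tf.zeros([10]))
XX = tf.reshape(images_batch, [-1, 28 * 28 * 3])
Y = tf.nn.softmax(tf.matmul(XX, W) + b)

# one-hot encode the integer labels so the cross-entropy sums over classes
Y_ = tf.one_hot(tf.cast(labels_batch, tf.int32), 10)
cross_entropy = -tf.reduce_mean(Y_ * tf.log(Y)) * 100.0
train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)

with tf.Session() as sess:
    # only now, with every variable already created, run the initializers
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())

    # and only then start the threads that feed the shuffle_batch queue
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for i in range(101):
        sess.run(train_step)

    coord.request_stop()
    coord.join(threads)

The key detail is that tf.global_variables_initializer() (and its local counterpart) only initializes the variables that exist in the graph at the moment it is run. In the original code it ran on an essentially empty graph, before the input pipeline, W and b were created, so nothing was actually initialized and the session failed as soon as it touched the uninitialized state; running the initializers after graph construction and before tf.train.start_queue_runners avoids this.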

Upvotes: 1
