Using queues in TensorFlow to load images and labels from text file

Question

I am trying to run a very simple neural network in TensorFlow which will learn to classify images. So far it is extremely simple, because I am still learning the framework.

Right now I am struggling to load the data. My data is described in a TXT file: every line contains the ID of a photo and a binary number used as its label, separated by a comma.

This is my code so far (I have stripped unrelated parts):

import tensorflow as tf

IMAGE_WIDTH = 240
IMAGE_HEIGHT = 180
NUMBER_OF_CHANNELS = 3
SOURCE_DIR = './data/'
TRAINING_IMAGES_DIR = SOURCE_DIR + 'train/'
LIST_FILE_NAME = 'list.txt'
BATCH_SIZE = 100
TRAINING_SET_SIZE = 15873

def create_photo_and_label_batches(source_directory):
  # read the list of photo IDs and labels
  photos_list = open(source_directory + LIST_FILE_NAME, 'r')
  filenames_list = []
  labels_list = []
  # get lists of photo file names and labels
  for line in photos_list:
    filenames_list.append(source_directory + line.split(',')[0] + '.jpg')
    # parse as int first: bool() of any non-empty string, even '0', is True
    labels_list.append([bool(int(line.split(',')[1]))])
  # convert the lists to tensors
  filenames = tf.convert_to_tensor(filenames_list, dtype=tf.string)
  labels = tf.convert_to_tensor(labels_list, dtype=tf.bool)
  # create queue with filenames and labels
  file_names_queue, labels_queue = tf.train.slice_input_producer(
      [filenames, labels], num_epochs=1, shuffle=True)
  # convert filenames of photos to input vectors
  photos_queue = tf.read_file(file_names_queue)  # convert filenames to content
  photos_queue = tf.image.decode_jpeg(photos_queue, channels=NUMBER_OF_CHANNELS)
  photos_queue.set_shape([IMAGE_HEIGHT, IMAGE_WIDTH, NUMBER_OF_CHANNELS])
  photos_queue = tf.to_float(photos_queue)  # convert uint8 to float32
  photos_queue = tf.reshape(photos_queue, [-1]) # flatten the tensor
  # slice the data into mini batches
  return tf.train.batch([photos_queue, labels_queue], batch_size=BATCH_SIZE)

def main(_):
  # load the training set
  training_photo_batch, training_label_batch = (
      create_photo_and_label_batches(TRAINING_IMAGES_DIR))

  # create the model
  x = training_photo_batch
  W = tf.Variable(tf.zeros([IMAGE_WIDTH * IMAGE_HEIGHT * NUMBER_OF_CHANNELS, 1],
     dtype=tf.float32))  # weights tensor
  b = tf.Variable(tf.zeros([1], dtype=tf.float32))  # bias
  y_ = training_label_batch
  y = tf.matmul(x, W) + b

  # define loss and optimizer
  cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_))
  train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

  # do the training
  sess = tf.InteractiveSession()
  tf.initialize_all_variables().run()
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)
  for i in range(TRAINING_SET_SIZE // BATCH_SIZE):
    sess.run(train_step)

  # stop the queue threads and properly close the session
  coord.request_stop()
  coord.join(threads)
  sess.close()

As you can see, the network is a very simple one, with only one neuron. I was inspired by the code listed here: Tensorflow read images with labels.

After running the code I get the following error on the very first iteration:

tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 100, current size 0)
 [[Node: batch = QueueDequeueMany[_class=["loc:@batch/fifo_queue"], component_types=[DT_FLOAT, DT_BOOL], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]

I have been trying to fix the problem for a couple of hours now. So far I have checked that:

filenames_list and labels_list are properly loaded,
the shapes of the tensors (x, y, y_, W and b) are correct,
the TensorFlow graph is properly constructed and visible in TensorBoard.

I have no idea what else I should check. It seems that I don't understand something about queues in TensorFlow, but I do not know what exactly. Thanks in advance for any help!

LI Xuhong · Accepted Answer

The error might be caused by num_epochs=1 in this call: tf.train.slice_input_producer([filenames, labels], num_epochs=1, shuffle=True). The API documentation of slice_input_producer explains the argument as follows: num_epochs: An integer (optional). If specified, slice_input_producer produces each slice num_epochs times before generating an OutOfRange error.
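To make the documented num_epochs behaviour quoted above concrete, here is a minimal plain-Python sketch. It is a toy model, not TensorFlow's implementation: the names OutOfRange and epoch_limited_batches are hypothetical. It assumes that only full batches are handed out, which is how tf.train.batch behaves by default.

```python
import random


class OutOfRange(Exception):
    """Hypothetical stand-in for tf.errors.OutOfRangeError in this toy model."""


def epoch_limited_batches(items, batch_size, num_epochs=None, shuffle=True, seed=0):
    """Yield full batches, passing over `items` at most `num_epochs` times.

    With num_epochs=None the generator cycles forever. Otherwise, once the
    last epoch has been consumed, it raises OutOfRange instead of returning
    a partial batch.
    """
    rng = random.Random(seed)
    buffer = []
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(items)
        if shuffle:
            rng.shuffle(order)
        for item in order:
            buffer.append(item)
            if len(buffer) == batch_size:
                yield buffer
                buffer = []
        epoch += 1
    # The producer is "closed": the leftover items cannot fill a batch.
    raise OutOfRange(
        "queue is closed and has insufficient elements "
        "(requested %d, current size %d)" % (batch_size, len(buffer)))


# Same sizes as in the question: 15873 examples, batches of 100, one epoch.
batches = epoch_limited_batches(range(15873), batch_size=100, num_epochs=1)
full_batches = 0
message = ""
try:
    for batch in batches:
        full_batches += 1
except OutOfRange as error:
    message = str(error)

print(full_batches)  # 158
print(message)       # ... (requested 100, current size 73)
```

In this model the error only appears after 158 full batches, while the loop in the question runs exactly 15873 // 100 = 158 times, so the epoch limit alone would not explain a failure on the very first iteration with current size 0. In TensorFlow 1.x, setting num_epochs also makes the input producer create a local epoch counter variable, which has to be initialised with tf.local_variables_initializer() in addition to the global variables. If it is not, the queue runner fails and the queue is closed while still empty, which would be consistent with the message in the question. So it may be worth trying either of two things: run the local variables initializer before starting the queue runners, or drop num_epochs.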
