Reputation: 285
I am trying to adapt my Caffe code to TensorFlow. What is the best way to convert my train.txt and test.txt so that they work with TensorFlow?
My train.txt looks like this:
/folder/filename1.jpg 1
/folder/filename2.jpg 2
...
The first column is the image filename and the second column is the class label.
Thanks!!
Upvotes: 4
Views: 898
Reputation: 126154
I'm assuming that you want to obtain a batch of identically-sized images with numeric labels. We'll use tf.decode_csv() to parse the text, tf.read_file() to load the JPEG data as a string, tf.image.decode_jpeg() to parse it into a dense tensor, and finally tf.train.batch() to build the parsed data into a batch of images. Many of these functions have a lot of options to configure, so see the documentation for further customization details.
import tensorflow as tf

# Set options here for whether to repeat, etc.
filename_producer = tf.string_input_producer(["train.txt"], ...)
# Read lines from the file, one at a time.
line_reader = tf.TextLineReader()
# read() returns a (key, value) pair; the value is the text of the line.
_, next_line = line_reader.read(filename_producer)
# Parse line into a filename and an integer label.
image_filename, label = tf.decode_csv(
    next_line,
    [tf.constant([], dtype=tf.string), tf.constant([], dtype=tf.int32)],
    field_delim=" ")
# Read the image as a string.
image_bytes = tf.read_file(image_filename)
# Convert the image into a 3-D tensor (height x width x channels).
image_tensor = tf.image.decode_jpeg(image_bytes, ...)
# OPTIONAL: Crop or pad the images to a standard size if they are not one already.
HEIGHT = ...
WIDTH = ...
image_tensor = tf.image.resize_image_with_crop_or_pad(image_tensor, HEIGHT, WIDTH)
# Create a batch of images.
BATCH_SIZE = 32
images, labels = tf.train.batch([image_tensor, label], BATCH_SIZE, ...)
# [...build the rest of your model...]
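Before wiring train.txt into the pipeline, it can help to check that every line really splits into exactly two space-separated fields, which is what the tf.decode_csv() call above expects. Here is a minimal plain-Python sketch of such a check; parse_list_line and read_list_file are hypothetical helper names, not part of TensorFlow:

```python
def parse_list_line(line):
    """Split one 'path label' line into (path, int_label)."""
    fields = line.strip().split(" ")
    # Exactly two fields are expected: the image path and the integer label.
    if len(fields) != 2:
        raise ValueError("expected 'path label', got: %r" % line)
    path, label = fields
    return path, int(label)


def read_list_file(lines):
    """Parse an iterable of lines, skipping blank ones."""
    return [parse_list_line(line) for line in lines if line.strip()]


# Sample lines in the format shown in the question.
sample = ["/folder/filename1.jpg 1\n", "/folder/filename2.jpg 2\n"]
pairs = read_list_file(sample)
print(pairs)
```

To check a real file, pass an open file object instead of the sample list, for example read_list_file(open("train.txt")). A path that contains a space would raise ValueError here, which is a useful early warning.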
The TensorFlow code above makes extensive use of TensorFlow prefetching to load the examples. The TensorFlow documentation has a how-to that explains the prefetching feature, but the most important thing to note is that you must call tf.train.start_queue_runners() at the start of your session to begin prefetching:
sess = tf.Session()
# You must execute this statement to begin prefetching data.
tf.train.start_queue_runners(sess)
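To see why that call matters: the input ops read from queues that are filled by background threads, and nothing arrives in those queues until the threads are started. The following is only a plain-Python analogy built on the standard queue and threading modules, not TensorFlow's implementation; the names producer, examples, and worker are made up for illustration:

```python
import queue
import threading


def producer(q, items):
    """Fill the queue in the background, then signal the end with None."""
    for item in items:
        q.put(item)  # blocks while the bounded queue is full
    q.put(None)


# A small bounded queue plays the role of the prefetch buffer.
examples = queue.Queue(maxsize=4)
worker = threading.Thread(target=producer, args=(examples, range(10)), daemon=True)

# Until the worker is started, examples.get() would block forever,
# much like running the batch op without starting the queue runners.
worker.start()

consumed = []
while True:
    item = examples.get()
    if item is None:
        break
    consumed.append(item)
worker.join()
print(consumed)
```

If your training step seems to hang on its first run, a missing tf.train.start_queue_runners() call is a likely cause.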
Upvotes: 4