Geoffrey Wu

Reputation: 285

Convert Caffe train.txt to TensorFlow

I am trying to adapt my Caffe code to TensorFlow. What is the best way to convert my train.txt and test.txt so that they work with TensorFlow?

In my train.txt, the file looks like:

/folder/filename1.jpg 1
/folder/filename2.jpg 2
...

The first column is the image path and the second column is the class label.

Thanks!!

Upvotes: 4

Views: 898

Answers (1)

mrry

Reputation: 126154

I'm assuming that you want to obtain a batch of identically-sized images with numeric labels. We'll use tf.decode_csv() to parse the text, tf.read_file() to load the JPEG data as a string, tf.image.decode_jpeg() to parse it into a dense tensor, and finally tf.train.batch() to build the parsed data into a batch of images. Many of these functions have a lot of options to configure, so see the documentation for further customization details.

import tensorflow as tf

# Set options here for whether to repeat, etc.
filename_producer = tf.string_input_producer(["train.txt"], ...)

# Read lines from the file, one at a time.
# read() returns a (key, value) pair; the value is the text of the line.
line_reader = tf.TextLineReader()
_, next_line = line_reader.read(filename_producer)

# Parse line into a filename and an integer label.
image_filename, label = tf.decode_csv(
    next_line, [tf.constant([], dtype=tf.string), tf.constant([], dtype=tf.int32)],
    field_delim=" ")

# Read the image as a string.
image_bytes = tf.read_file(image_filename)

# Convert the image into a 3-D tensor (height x width x channels).
image_tensor = tf.image.decode_jpeg(image_bytes, ...)

# OPTIONAL: Crop or pad the images to a standard size if they are not already the same size.
HEIGHT = ...
WIDTH = ...
image_tensor = tf.image.resize_image_with_crop_or_pad(image_tensor, HEIGHT, WIDTH)

# Create a batch of images.
BATCH_SIZE = 32
images, labels = tf.train.batch([image_tensor, label], BATCH_SIZE, ...)

# [...build the rest of your model...]
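
Before wiring train.txt into the pipeline above, it can help to check that every line really is "path, space, integer label", because blank lines or paths that contain spaces would likely trip up the space-delimited CSV parsing. The following is a minimal plain-Python sketch of such a check. The helper name parse_caffe_list is hypothetical, and it assumes the label is always the last whitespace-separated field on the line.

```python
def parse_caffe_list(path):
    """Parse a Caffe-style list file into (image_path, label) pairs.

    Hypothetical helper for sanity-checking the file; it is not part of
    TensorFlow or Caffe. Assumes the label is the last field on each line.
    """
    pairs = []
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            # Split once from the right so the final field is the label.
            fields = line.rsplit(None, 1)
            if len(fields) != 2:
                raise ValueError(
                    "line %d has no label: %r" % (line_number, line))
            filename, label_text = fields
            # int() raises ValueError if the label is not an integer.
            pairs.append((filename, int(label_text)))
    return pairs
```

Calling parse_caffe_list("train.txt") returns a list of pairs, or raises ValueError at the first malformed line, which makes it easy to spot problems before they surface as parsing errors inside the input queue.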

This example makes extensive use of TensorFlow prefetching to load the examples. The TensorFlow documentation has a how-to that explains how to use the prefetching feature, but the most important thing to note is that you must call tf.train.start_queue_runners() at the start of your session to begin prefetching:

sess = tf.Session()

# You must execute this statement to begin prefetching data.
tf.train.start_queue_runners(sess)

Upvotes: 4
