zbess

Reputation: 852

Loading images and labels from a CSV file using TensorFlow

I've been trying to get the TensorFlow I/O API to work this morning.

After some research, I managed to read in the data, but I cannot keep images and labels correctly paired when dequeuing.

Here is the code I wrote:

# load csv content
csv_path = tf.train.string_input_producer(['list1.csv', 'list2.csv'])
textReader = tf.TextLineReader()
_, csv_content = textReader.read(csv_path)
im_name, label = tf.decode_csv(csv_content, record_defaults=[[""], [1]])

# load images
im_content = tf.read_file(im_dir+im_name)
image = tf.image.decode_png(im_content, channels=3)
image = tf.cast(image, tf.float32) / 255.
image = tf.image.resize_images(image, 640, 640)

# make batches
im_batch, lb_batch = tf.train.batch([image, label], batch_size=batch)

The order of im_batch and lb_batch is messed up (the images are bound to random labels).

Any idea what's happening? Thanks.

Upvotes: 3

Views: 4027

Answers (1)

Ted Zhai

Reputation: 76

The code you posted is fine.

im_batch, lb_batch = tf.train.batch([image, label], batch_size=batch)

The line above puts image and label into the same queue, so evaluating either im_batch or lb_batch dequeues a full batch of (image, label) pairs, which consumes the matching elements of the other tensor as well. A common mistake is therefore to call im_batch.eval() and lb_batch.eval() separately:

# this is wrong
images = im_batch.eval()
labels = lb_batch.eval()

After im_batch.eval() is called, the labels belonging to those images are dequeued as well, because both tensors come from the same queue. So when lb_batch.eval() is called right afterwards, it actually returns the labels of the next batch.

The right way is to fetch im_batch and lb_batch in a single operation, either through a graph op that uses both or through a single sess.run() fetch list:

1.

# this is correct
loss = your_network_model(im_batch, lb_batch)
loss.eval()

2.

# this is correct
images, labels = sess.run([im_batch, lb_batch])
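
To see why the separate calls go out of step, here is a minimal sketch in plain Python that imitates the behaviour. It is not TensorFlow code: PairQueue, dequeue_batch, make_queue and the "img0", "img1", ... data are hypothetical names made up for illustration, and the sketch assumes a single reader with no shuffling.

```python
from collections import deque


class PairQueue:
    """Toy stand-in for the single queue behind tf.train.batch (not TensorFlow)."""

    def __init__(self, pairs):
        self._queue = deque(pairs)

    def dequeue_batch(self, batch_size):
        # Every fetch removes whole (image, label) pairs, even if the
        # caller only looks at one of the two columns.
        batch = [self._queue.popleft() for _ in range(batch_size)]
        images = [image for image, _ in batch]
        labels = [label for _, label in batch]
        return images, labels


def make_queue():
    # Hypothetical data: image "img3" belongs with label 3, and so on.
    return PairQueue([("img%d" % i, i) for i in range(8)])


# Separate fetches, like im_batch.eval() followed by lb_batch.eval():
queue = make_queue()
wrong_images, _ = queue.dequeue_batch(2)  # consumes pairs 0 and 1
_, wrong_labels = queue.dequeue_batch(2)  # consumes pairs 2 and 3

# One joint fetch, like sess.run([im_batch, lb_batch]):
queue = make_queue()
right_images, right_labels = queue.dequeue_batch(2)

print(wrong_images, wrong_labels)  # ['img0', 'img1'] [2, 3]
print(right_images, right_labels)  # ['img0', 'img1'] [0, 1]
```

With the two separate fetches the labels are one batch ahead of the images, which matches the symptom in the question. The joint fetch keeps each image with its own label.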

Upvotes: 6
