Reputation: 137
I am trying to read images and labels from a TFRecord file and then train on them. I know that my TFRecord file exists, and I have checked that it contains 1000 images and labels. The problem only seems to arise when I pipe the data in as input for training. I am new to Python and TensorFlow, and I am not sure how to fix it.
I get the following error, occurring at tf.train.shuffle_batch:
...
Caused by op 'shuffle_batch', defined at:
  File "C:/AI/projects/DataGen/train.py", line 40, in <module>
    images_batch, labels_batch = tf.train.shuffle_batch([image, label], batch_size=10, capacity=1000,min_after_dequeue=2)
...
Here is my code, cobbled together from various MNIST examples:
import tensorflow as tf


def read_and_decode_single_example(filename):
    # First construct a queue containing a list of filenames.
    # This lets a user split up their dataset into multiple files to keep
    # the size down.
    filename_queue = tf.train.string_input_producer([filename],
                                                    num_epochs=None)
    # Unlike the TFRecordWriter, the TFRecordReader is symbolic
    reader = tf.TFRecordReader()
    # One can read a single serialized example from a filename
    # serialized_example is a Tensor of type string.
    _, serialized_example = reader.read(filename_queue)
    # The serialized example is converted back to actual values.
    # One needs to describe the format of the objects to be returned
    feature = {'image': tf.FixedLenFeature([], tf.string),
               'label': tf.FixedLenFeature([], tf.int64)}
    features = tf.parse_single_example(serialized_example, features=feature)
    # now return the converted data
    label = tf.cast(features['label'], tf.float32)
    image = tf.decode_raw(features['image'], tf.float32)
    image = tf.reshape(image, [28, 28, 3])
    return label, image

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())
    # get single examples
    label, image = read_and_decode_single_example("train.tfrecords")
    image = tf.cast(image, tf.float32) / 255.
    # groups examples into batches randomly
    images_batch, labels_batch = tf.train.shuffle_batch([image, label], batch_size=10, capacity=1000, min_after_dequeue=2)

    # The model is:
    #
    # Y = softmax( X * W + b)
    # X: matrix for rgb images of 28x28 pixels, flattened (there are 10 images in a mini-batch)
    # W: weight matrix with (28x28x3) lines and 10 columns
    # b: bias vector with 10 dimensions
    # +: add with broadcasting: adds the vector to each line of the matrix (numpy)
    # softmax(matrix) applies softmax on each line
    # softmax(line) applies an exp to each value then divides by the sum of the resulting line
    # Y: output matrix with 10 lines and 10 columns

    # input X: 28x28x3 RGB images
    X = images_batch
    # correct answers will go here
    Y_ = labels_batch
    # weights W[28 * 28 * 3, 10]
    W = tf.Variable(tf.zeros([28 * 28 * 3, 10]))
    # biases b[10]
    b = tf.Variable(tf.zeros([10]))

    # flatten the images into a single line of pixels
    # -1 in the shape definition means "the only possible dimension that will preserve the number of elements"
    XX = tf.reshape(X, [-1, 28 * 28 * 3])

    # The model
    Y = tf.nn.softmax(tf.matmul(XX, W) + b)

    # loss function: cross-entropy = - sum( Y_i * log(Yi) )
    # Y: the computed output vector
    # Y_: the desired output vector

    # cross-entropy
    # log takes the log of each element, * multiplies the tensors element by element
    # reduce_mean averages all the components in the tensor, so multiplying by 100
    # (10 images x 10 classes) turns the mean back into the total cross-entropy
    # for all images in the batch
    cross_entropy = -tf.reduce_mean(Y_ * tf.log(Y)) * 100.0

    # accuracy of the trained model, between 0 (worst) and 1 (best)
    correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # training, learning rate = 0.005
    train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(100 + 1):
        print(i)
        sess.run(train_step)
    coord.request_stop()
    # Wait for threads to stop
    coord.join(threads)
    sess.close()
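
To make the formulas in the comments above concrete, here is a small plain-Python sketch of softmax and cross-entropy for a single image. It is not part of train.py and does not use TensorFlow. The logits and the three-class label are made-up values, and the label is assumed to be one-hot, which is what the formula -sum(Y_i * log(Yi)) and the tf.argmax(Y_, 1) call imply.

```python
import math


def softmax(logits):
    """Apply exp to each value, then divide by the sum of the results."""
    peak = max(logits)
    # Subtracting the max does not change the result but avoids overflow.
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, probs):
    """Compute -sum(Y_i * log(Y_i)) for a single example."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


logits = [2.0, 1.0, 0.1]  # hypothetical X * W + b for one image, 3 classes
label = [1.0, 0.0, 0.0]   # hypothetical one-hot label: class 0 is correct

probs = softmax(logits)
loss = cross_entropy(label, probs)
predicted = probs.index(max(probs))  # same role as tf.argmax(Y, 1)
```

With a one-hot label, only the log-probability of the correct class contributes to the loss, so the loss shrinks toward 0 as that probability approaches 1.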
Upvotes: 0
Views: 684
Reputation: 137
I moved the initialization to just before the tf.train.start_queue_runners call, i.e. after the model is set up, and that solved the problem:
sess.run(tf.local_variables_initializer())
sess.run(tf.global_variables_initializer())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
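
As far as I understand it, the order matters because tf.global_variables_initializer() builds an op that covers only the variables that already exist in the graph at the moment it is called, so W and b were never initialized in the original order. The snippet below is a toy model of that behaviour in plain Python, not TensorFlow code: ToyGraph and its methods are names I made up purely for illustration.

```python
class ToyGraph:
    """Toy stand-in for a TF1 graph: tracks variables and whether they are initialized."""

    def __init__(self):
        self.variables = {}  # variable name -> initialized flag

    def add_variable(self, name):
        self.variables[name] = False

    def make_initializer(self):
        # Snapshot the variables that exist right now; variables added
        # later are not covered by this initializer.
        snapshot = list(self.variables)

        def run():
            for name in snapshot:
                self.variables[name] = True

        return run

    def uninitialized(self):
        return sorted(n for n, ready in self.variables.items() if not ready)


# Order from the question: initializer created before the model.
early = ToyGraph()
init_early = early.make_initializer()
early.add_variable("W")
early.add_variable("b")
init_early()
missed = early.uninitialized()   # W and b were never initialized

# Order from this answer: model first, initializer afterwards.
late = ToyGraph()
late.add_variable("W")
late.add_variable("b")
init_late = late.make_initializer()
init_late()
covered = late.uninitialized()   # nothing left uninitialized
```

In the first ordering the initializer runs without error but touches nothing, which is why the mistake only shows up later, once training starts.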
Upvotes: 1