Lukeyb

Reputation: 857

TensorFlow: How to use 'num_epochs' in a string_input_producer

I can't enable epoch limits on my string_input_producer without getting an OutOfRange error (requested x, current size 0). No matter how many elements I request, there are always 0 available.

Here is my FileQueue builder:

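import os
from random import shuffle  # assumed: shuffle is random.shuffle (in-place)

import tensorflow as tf


def get_queue(base_directory):
    # Pick one random .bin file and queue it for a single epoch.
    files = [f for f in os.listdir(base_directory) if f.endswith('.bin')]
    shuffle(files)
    filenames = [os.path.join(base_directory, files[0])]
    fileQueue = tf.train.string_input_producer(filenames, shuffle=False, num_epochs=1)

    return fileQueue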

If I remove num_epochs=1 from the string_input_producer, the pipeline produces samples without any error.

My input pipeline:

def input_pipeline(instructions, fileQueue):
    example, label, feature_name_list = read_binary_format(fileQueue, instructions)

    num_preprocess_threads = 16
    capacity = 20

    example, label = tf.train.batch(
        [example, label],
        batch_size=50000,    # set the batch size way bigger so we always return the full amount of samples from the file
        allow_smaller_final_batch=True,
        capacity=capacity,
        num_threads=num_preprocess_threads)

    return example, label

And lastly my session:

with tf.Session(graph=tf.Graph()) as sess:
    train_inst_set = sf.DeserializationInstructions.from_filename(os.path.join(input_dir, "Train/config.json"))
    fileQueue = sf.get_queue(os.path.join(input_dir, "Train"))
    features_train, labels_train = sf.input_pipeline(train_inst_set, fileQueue)
    sess.run(tf.global_variables_initializer())

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)

    train_feature_batch, train_label_batch = sess.run([features_train, labels_train])

Upvotes: 0

Views: 1305

Answers (1)

Lukeyb

Reputation: 857

The issue was caused by this: Issue #1045

tf.global_variables_initializer() does not initialise all variables: it skips local variables. Setting num_epochs makes string_input_producer create a local epoch counter, so you need to initialise the local variables too.

Add

sess.run(tf.group(tf.global_variables_initializer(), tf.local_variables_initializer()))

to your session, before starting the queue runners.
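For reference, here is a minimal self-contained sketch of the fix. It assumes tensorflow.compat.v1 is importable (so it should also run on a TF 2.x install); read_one_epoch and the file names are made up for illustration, and the files don't need to exist because only the filename queue itself is read.

```python
import tensorflow.compat.v1 as tf

# Queue-based input pipelines only work in graph mode.
tf.disable_eager_execution()


def read_one_epoch(filenames):
    """Hypothetical helper: dequeue every filename once, then stop."""
    graph = tf.Graph()
    with graph.as_default():
        # num_epochs=1 creates a *local* epoch-counter variable.
        queue = tf.train.string_input_producer(
            filenames, shuffle=False, num_epochs=1)
        filename = queue.dequeue()
        # Initialise global AND local variables in a single op.
        init_op = tf.group(tf.global_variables_initializer(),
                           tf.local_variables_initializer())

    results = []
    with tf.Session(graph=graph) as sess:
        sess.run(init_op)
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        try:
            while not coord.should_stop():
                results.append(sess.run(filename).decode())
        except tf.errors.OutOfRangeError:
            # Raised once the single epoch has been consumed.
            pass
        finally:
            coord.request_stop()
            coord.join(threads)
    return results


if __name__ == "__main__":
    print(read_one_epoch(["a.bin", "b.bin"]))
```

Dropping tf.local_variables_initializer() from init_op should make the run fail again, since the epoch counter would be left uninitialised.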

Upvotes: 1
