user1274878

Reputation: 1405

Tensorflow: feeding data to the graph

I have written an algorithm using Tensorflow. The structure of my code is as follows:

  1. Read data from a CSV file and store it in a list of lists, where each inner list holds a single line of the CSV.

  2. Use the feed_dict approach to feed the graph a single line of data at a time, in a loop, until all the lines are processed.

The TF graph is executed on the GPU. My question concerns the data transfer that happens from the CPU to the GPU. Does using feed_dict this way mean that there will be lots of small transfers from the host to the device? If so, would it be feasible to do a bulk transfer using feed_dict and loop over the data inside the TF graph instead?
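Concretely, the two setups compare like this. This is only a sketch: the sample data, the batch size, and the `x`/`train_op` names are hypothetical, and the `sess.run` calls are shown as comments since they need a live TensorFlow session.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
sample = "1.0,2.0,3.0\n4.0,5.0,6.0\n7.0,8.0,9.0\n10.0,11.0,12.0\n"

# Step 1: read the CSV into a list of lists (one inner list per row).
rows = [[float(v) for v in line] for line in csv.reader(io.StringIO(sample))]

# Step 2a: per-row feeding -- one feed_dict call (and one small
# host-to-device transfer) per row:
#   for row in rows:
#       sess.run(train_op, feed_dict={x: [row]})

# Step 2b: bulk feeding -- group rows into batches so each feed_dict
# call ships one larger array instead of many tiny ones:
batch_size = 2
batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
#   for batch in batches:
#       sess.run(train_op, feed_dict={x: batch})

print(len(rows), len(batches))  # 4 rows grouped into 2 batches
```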

Upvotes: 3

Views: 1295

Answers (1)

Rick Lentz

Reputation: 503

Yes. If each feed is small (e.g. a single line of text), there will likely be many small host-to-device transfers.

Here is an example of using feed_dict with a custom fill_feed_dict function: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/fully_connected_feed.py

For your use case, though, it may be easier to handle the input using TensorFlow's QueueRunner. This creates a pool of reader threads that prefetch your data, which helps keep data available to the TensorFlow graph.

# Create the graph, etc.
init_op = tf.global_variables_initializer()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)

except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()

(https://www.tensorflow.org/how_tos/reading_data/#batching)
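As a rough standard-library analogy (not TensorFlow code) for what the queue runner's reader threads do, a background thread can prefetch batches into a bounded queue while the main loop consumes them, so reading and training overlap. The batch contents here are dummy data:

```python
import queue
import threading

# Bounded queue: the producer blocks when it gets too far ahead,
# mirroring a capacity-limited input queue.
prefetch = queue.Queue(maxsize=4)
SENTINEL = None

def reader(batches):
    # Background "reader thread": prefetch batches into the queue.
    for b in batches:
        prefetch.put(b)       # blocks when the queue is full
    prefetch.put(SENTINEL)    # signal end of input

batches = [[i, i + 1] for i in range(0, 10, 2)]  # dummy batches
t = threading.Thread(target=reader, args=(batches,))
t.start()

consumed = []
while True:
    b = prefetch.get()
    if b is SENTINEL:
        break
    consumed.append(b)        # stand-in for sess.run(train_op, ...)

t.join()
print(len(consumed))  # 5
```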

Upvotes: 2
