user1371314

Reputation: 832

TensorFlow: replacing feed_dict with Dataset.from_generator

I have an existing model that reads through a text file in a loop. The input and output tensors, and the training step, look like this:

    self.X = tf.placeholder('float32', shape=[None, None, max_word_length, ALPHABET_SIZE], name='X')
    self.Y = tf.placeholder('float32', shape=[None, 2], name='Y')
    ...
    _, c, a = sess.run([optimizer, cost, acc], feed_dict={self.X: batch_x, self.Y: batch_y})

Now I want to convert to the Dataset.from_generator method. To get started, I created a wrapper class around my text reader that implements the generator function. This works well and returns the input data as expected:

    dsr = DatasetReader(TRAIN_SET, BATCH_SIZE, max_word_length)
    ds = tf.data.Dataset.from_generator(dsr.generator, (tf.float32, tf.float32))
    ds = ds.prefetch(2)
    dsi = ds.make_one_shot_iterator()
    self.X, self.Y = dsi.get_next()
    _, c, a = sess.run([optimizer, cost, acc])

However, I am getting an error:

    InvalidArgumentError: You must feed a value for placeholder tensor 'X' with dtype float and shape [?,?,16,70]

I assume this is because I declared the X/Y inputs as placeholders, and the documentation states that placeholder values must be fed via feed_dict.

So I have a couple of questions:

  1. How do I convert from feed_dict and placeholders to using from_generator correctly? I want to keep the tensor names X and Y so that I can feed them by name during inference.

  2. More generally, I don't see how the dataset and its iterator are linked to the session. Are they linked purely by virtue of the iterator outputs being used as inputs to other operations in the graph?

Upvotes: 2

Views: 1425

Answers (1)

David Parks

Reputation: 32081

You get to drop the placeholders altogether. If you have a placeholder defined, you did it wrong. It should just be this, as you have it:

    self.X, self.Y = dsi.get_next()
    # continue your network

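Something in your code is still using the placeholder where it should be using self.X from dsi.get_next(). Most likely optimizer, cost and acc were built from the placeholders before self.X and self.Y were reassigned, so those operations still depend on the original placeholder tensors. Build the network after the get_next() call, from the tensors it returns.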
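As a minimal sketch of that ordering, assuming TensorFlow's graph-mode API through `tensorflow.compat.v1`: the generator, the shapes and the one-layer model below are hypothetical stand-ins for `dsr.generator` and the real network, not the asker's code. The point is only that every op is built from the tensors returned by `get_next()`, so `sess.run` needs no `feed_dict`.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def generator():
    # Hypothetical stand-in for dsr.generator: yields (inputs, labels) batches.
    labels = np.eye(2, dtype=np.float32)[[0, 1, 0, 1]]  # one-hot, shape (4, 2)
    for _ in range(3):
        yield np.ones((4, 5), np.float32), labels


graph = tf.Graph()
with graph.as_default():
    ds = tf.data.Dataset.from_generator(
        generator,
        (tf.float32, tf.float32),
        output_shapes=(tf.TensorShape([None, 5]), tf.TensorShape([None, 2])))
    ds = ds.prefetch(2)
    X, Y = tf.data.make_one_shot_iterator(ds).get_next()

    # The network is built from X and Y directly; no tf.placeholder is defined.
    W = tf.get_variable("W", shape=[5, 2], initializer=tf.zeros_initializer())
    logits = tf.matmul(X, W)
    cost = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y, logits=logits))
    optimizer = tf.train.GradientDescentOptimizer(0.1).minimize(cost)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Each run pulls one batch from the dataset; nothing is fed.
        _, c = sess.run([optimizer, cost])
```

If an op such as `cost` had been created from a placeholder earlier and only the Python attribute were reassigned afterwards, the graph would still contain the placeholder dependency, which would produce the "You must feed a value" error.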

> How do I convert from feed_dict and placeholders to using from_generator correctly? I want to keep the tensor names X and Y so that I can feed them by name during inference.

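You can name your generator outputs by yielding a dict, e.g. yield {"X": data, "y": labels}. The output_types passed to from_generator must then be a dict with the same keys, and dsi.get_next() returns a dict of tensors keyed the same way. Note that this names the entries of the returned element, not the graph tensors themselves.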
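A hedged sketch of the dict form follows, again assuming `tensorflow.compat.v1` and a hypothetical generator. The `tf.identity(..., name=...)` step is an addition that the answer does not mention: it is one way to give the tensors fixed graph names ("X:0", "Y:0") so that a value can be fed by name at inference time, which works because `feed_dict` can override most tensors, not only placeholders.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def generator():
    # Hypothetical generator that yields one dict per batch.
    for _ in range(2):
        yield {"X": np.full((3, 4), 2.0, np.float32),
               "y": np.ones((3, 2), np.float32)}


graph = tf.Graph()
with graph.as_default():
    ds = tf.data.Dataset.from_generator(
        generator,
        output_types={"X": tf.float32, "y": tf.float32},
        output_shapes={"X": tf.TensorShape([None, 4]),
                       "y": tf.TensorShape([None, 2])})
    batch = tf.data.make_one_shot_iterator(ds).get_next()  # dict of tensors

    # Assumption: identity ops give the inputs stable graph names.
    X = tf.identity(batch["X"], name="X")
    Y = tf.identity(batch["y"], name="Y")
    total = tf.reduce_sum(X, name="total")  # stand-in for the real network

    with tf.Session() as sess:
        # Training-style run: X comes from the dataset.
        from_dataset = sess.run(total)
        # Inference-style run: X is overridden by name, the iterator is not used.
        fed = sess.run("total:0",
                       feed_dict={"X:0": np.ones((1, 4), np.float32)})
```

When "X:0" is fed, the ops upstream of it, including the iterator, are not needed to compute the fetch, so the dataset is not consumed on that run.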

> More generally, I don't see how the dataset and its iterator are linked to the session. Are they linked purely by virtue of the iterator outputs being used as inputs to other operations in the graph?

Yes. The iterator is an element in the graph and works just like every other operation in TensorFlow: when it is part of the dependencies needed to compute the fetched results, it performs its defined operation, which is to produce the next element.
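This behaviour can be observed with a small sketch, assuming `tensorflow.compat.v1` and a hypothetical three-element generator. Only runs whose fetches depend on `get_next()` advance the iterator, and the session signals the end of the data with `tf.errors.OutOfRangeError`.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def generator():
    # Hypothetical data source with three scalar elements.
    for value in [1.0, 2.0, 3.0]:
        yield value


graph = tf.Graph()
with graph.as_default():
    ds = tf.data.Dataset.from_generator(generator, tf.float32)
    next_value = tf.data.make_one_shot_iterator(ds).get_next()
    doubled = next_value * 2.0      # depends on the iterator
    unrelated = tf.constant(7.0)    # does not depend on the iterator

    values = []
    with tf.Session() as sess:
        # Fetching an op that does not depend on the iterator consumes nothing.
        sess.run(unrelated)
        try:
            while True:
                # Each run of `doubled` pulls exactly one element.
                values.append(float(sess.run(doubled)))
        except tf.errors.OutOfRangeError:
            pass  # generator exhausted
```

All three elements are still available after the `unrelated` run, which is consistent with the iterator being executed only when a fetch depends on it.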

Upvotes: 2
