Feng Tian
Feng Tian

Reputation: 21

tensorflow estimator from_generator, how to set TensorShape?

I am trying use a generator to feed data into estimator. The following is the code. However, when try to run, I got the following error:

Update2: I finally made it work. So the correct tensorshape is ([], [], [])

Update: I added tensorshape ([None], [None], [None]), then I changed ds.batch(10), to an assignment ds = ds.batch(10)

but still got error.

Traceback (most recent call last):
  File "xyz.py", line 79, in <module>
    tf.app.run(main=main, argv=None)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "xyz.py", line 67, in main
    model.train(input_fn=lambda: input_fn(100))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 783, in _train_model
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 521, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 892, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 967, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 952, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1024, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 827, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: exceptions.ValueError: `generator` yielded an element of shape () where an element of shape (?,) was expected.
         [[Node: PyFunc = PyFunc[Tin=[DT_INT64], Tout=[DT_INT64, DT_STRING, DT_FLOAT], token="pyfunc_1"](arg0)]]
         [[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,?], [?,?], [?,?]], output_types=[DT_INT64, DT_STRING, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

So my question, how to set the TensorShape? The from generator takes a third argument of TensorShape but I cannot find any example/doc on how to set it. Any help?

Thanks,

def gen(nn):
    ii = 0
    while ii < nn:
        ii += 1
        yield ii, 't{0}'.format(ii), ii*2

def input_fn(n):
    ds = tf.data.Dataset.from_generator(lambda: gen(n), (tf.int64, tf.string, tf.float32), ([None], [None], [None])) 
    ds = ds.batch(10)
    x, y, z = ds.make_one_shot_iterator().get_next()
    return {'x': x, 'y': y}, tf.greater_equal(z, 10)

def build_columns():
    x = tf.feature_column.numeric_column('x')
    y = tf.feature_column.categorical_column_with_hash_bucket('y', hash_bucket_size=5)
    return [x, y]

def build_estimator():
    run_config = tf.estimator.RunConfig().replace(
            session_config=tf.ConfigProto(device_count={'GPU': 0}))
    return tf.estimator.LinearClassifier(model_dir=FLAGS.model_dir, feature_columns=build_columns(), config=run_config)

def main(unused):
  # Clean up the model directory if present
  shutil.rmtree(FLAGS.model_dir, ignore_errors=True)
  model = build_estimator()

  # Train and evaluate the model every `FLAGS.epochs_per_eval` epochs.
  for n in range(FLAGS.train_epochs // FLAGS.epochs_per_eval):
    model.train(input_fn=lambda: input_fn(100))
    results = model.evaluate(input_fn=lambda: input_fn(20))

Upvotes: 2

Views: 1655

Answers (1)

Olivier Moindrot
Olivier Moindrot

Reputation: 28198

As mentioned by @FengTian in an update, the correct answer was to use shape ([], [], []) as the output shape of the generator:

tf.data.Dataset.from_generator(lambda: gen(n), (tf.int64, tf.string, tf.float32), ([], [], []))

Upvotes: 1

Related Questions