Reputation: 19957
I'm running some tutorial code from text classification
I can run the script and it works, but when I tried running it line by line to understand what each step does, I got confused at this step:
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={WORDS_FEATURE: x_test},
    y=y_test,
    num_epochs=1,
    shuffle=False)
classifier.train(input_fn=train_input_fn, steps=100)
I know that, conceptually, train_input_fn feeds data to the training function, but how can I call this function manually to inspect what's in it?
I've traced the code and found that train_input_fn feeds data to the following two variables:
features
Out[15]: {'words': <tf.Tensor 'random_shuffle_queue_DequeueMany:1' shape=(560, 10) dtype=int64>}
labels
Out[16]: <tf.Tensor 'random_shuffle_queue_DequeueMany:2' shape=(560,) dtype=int32>
When I tried to evaluate features with sess.run(features), my terminal hung and stopped responding.
What's the right way to inspect the contents of variables like these?
Thank you!
Upvotes: 4
Views: 782
Reputation: 4183
Based on the numpy_input_fn documentation and the behaviour (hanging), I imagine the underlying implementation depends on a queue runner. sess.run hangs when the queue runners haven't been started, because nothing is filling the queue it is waiting on. Try modifying your session code to something like the following, based on this guide:
with tf.Session() as sess:
    # Start the threads that fill the input queues.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    try:
        for step in range(1000000):
            if coord.should_stop():
                break
            features_data = sess.run(features)
            print(features_data)
    except Exception as e:
        # Report exceptions to the coordinator.
        coord.request_stop(e)
    finally:
        # Terminate as usual. It is safe to call `coord.request_stop()` twice.
        coord.request_stop()
        coord.join(threads)
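To see why the missing queue runners cause a hang, here is a plain-Python analogy using only the standard library. It is not how TensorFlow implements its queues; the names producer and try_dequeue are made up for illustration, and the timeout exists only so the demo cannot block forever.

```python
import queue
import threading


def producer(q, items):
    """Plays the role of a queue runner thread: it fills the queue."""
    for item in items:
        q.put(item)


def try_dequeue(q, timeout=0.2):
    """Blocking read with a timeout so this demo cannot hang forever."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None


q = queue.Queue()

# No producer has been started, so the read finds nothing and times out.
# Without the timeout it would block indefinitely, much like
# sess.run(features) before the queue runners are started.
before = try_dequeue(q)

# Start the producer (loosely analogous to tf.train.start_queue_runners).
worker = threading.Thread(target=producer, args=(q, [1, 2, 3]))
worker.start()
worker.join()

# Now the same read returns immediately with the first item.
after = try_dequeue(q)
print(before, after)
```

The first read returns nothing because no thread is producing data. Once the producer has run, the same call succeeds.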
Alternatively, I'd encourage you to check out the tf.data.Dataset interface (possibly tf.contrib.data.Dataset in TensorFlow 1.3 or earlier). You can get similar input/label tensors without queues by using Dataset.from_tensor_slices. Creation is slightly more involved, but the interface is much more flexible and the implementation doesn't use queue runners, so running it in a session is much simpler.
import tensorflow as tf
import numpy as np
x_data = np.random.random((100000, 2))
y_data = np.random.random((100000,))
batch_size = 2
buff = 100

def input_fn():
    # possibly tf.contrib.data.Dataset.from... in tf 1.3 or earlier
    dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data))
    dataset = dataset.repeat().shuffle(buff).batch(batch_size)
    x, y = dataset.make_one_shot_iterator().get_next()
    return x, y

x, y = input_fn()
with tf.Session() as sess:
    print(sess.run([x, y]))
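For intuition about what each sess.run([x, y]) call returns, here is a rough NumPy-only sketch of slicing along the first axis and grouping the slices into batches. It deliberately ignores repeat and shuffle, it is not TensorFlow's implementation, and slice_and_batch is a hypothetical helper name.

```python
import numpy as np


def slice_and_batch(features, labels, batch_size):
    """Yield (features, labels) batches by slicing along the first axis."""
    for start in range(0, len(features), batch_size):
        stop = start + batch_size
        yield features[start:stop], labels[start:stop]


# Small deterministic stand-ins for x_data and y_data.
x_data = np.arange(12).reshape(6, 2)
y_data = np.arange(6)

batches = list(slice_and_batch(x_data, y_data, batch_size=2))
first_x, first_y = batches[0]

# Each batch pairs batch_size feature rows with the matching labels.
print(first_x.shape, first_y.shape)
```

With batch_size = 2 and two feature columns, each feature batch has shape (2, 2) and each label batch has shape (2,).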
Upvotes: 3