plhn

Reputation: 5273

Why does tf.placeholder return serialized example(s)?

I'm reading mnist_saved_model.py.

At lines 62-64, tf.placeholder returns a serialized example, and tf.parse_example uses it like this:

serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
feature_configs = {'x': tf.FixedLenFeature(shape=[784], dtype=tf.float32),}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)

I know that tf.placeholder returns a Tensor. So what exactly does 'serialized tf.Example' mean?

Upvotes: 1

Views: 482

Answers (1)

Lior

Reputation: 2019

Placeholders (tf.placeholder) can be used as entry points to your model for different kinds of data. For example, you could use x = tf.placeholder(shape=[784], dtype=tf.float32), which is suitable for feeding NumPy arrays of shape [784] and a float type.

In your example, the placeholder should be fed string data. This is useful if you obtain your data directly from a text file or from a TFRecord file, which typically stores each record as a serialized protobuf (protobuf is a format for serializing data). (To read the serialized strings from a TFRecord file, you can use tf.python_io.tf_record_iterator. You can also use the tf.data.Dataset API.)

When you obtain the string from a TFRecord file, it is a serialized representation of your example. You now want to parse it into a tf.Tensor. This is done using tf.parse_example, which returns a dictionary mapping each feature name to a tensor. The feature_configs dictionary specifies how to parse the data from the string.
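To make "serialized" concrete, here is a minimal plain-Python sketch of the same round trip. It is only an analogy: it uses the standard struct module instead of the protobuf wire format that tf.Example really uses, and the names FEATURE_SPEC, serialize_record and parse_record are hypothetical, made up for this illustration.

```python
import struct

# Hypothetical schema playing the role of feature_configs: one feature 'x'
# made of 4 float32 values (the MNIST example uses 784).
FEATURE_SPEC = {'x': 4}

def serialize_record(values):
    """Pack a list of floats into one byte string (little-endian float32)."""
    return struct.pack('<%df' % len(values), *values)

def parse_record(serialized, spec):
    """Decode the byte string back into numbers, as described by spec."""
    count = spec['x']
    return {'x': list(struct.unpack('<%df' % count, serialized))}

serialized = serialize_record([0.0, 0.5, 1.0, 0.25])
parsed = parse_record(serialized, FEATURE_SPEC)

print(type(serialized).__name__, len(serialized))  # bytes 16
print(parsed['x'])                                 # [0.0, 0.5, 1.0, 0.25]
```

The byte string on its own is opaque. Only the spec tells the parser how many values to expect and of which type, which is the role feature_configs plays for tf.parse_example.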

(Note: tf.parse_example does not actually parse anything - it only adds an operation to the computational graph that can perform parsing. The parsing itself will occur when you run the graph, using sess.run(...))
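The difference between adding an operation and running it can be sketched in plain Python. This is a toy model, not how TensorFlow is implemented: Node, run and parse_csv are hypothetical names invented for the illustration, with run standing in loosely for sess.run.

```python
class Node:
    """Hypothetical stand-in for a graph node: it records work, it does not do it."""
    def __init__(self, fn=None, inputs=()):
        self.fn = fn                  # None marks a placeholder
        self.inputs = tuple(inputs)

def run(node, feed_dict):
    """Evaluate a node, using fed values for placeholders (loosely like sess.run)."""
    if node in feed_dict:
        return feed_dict[node]
    if node.fn is None:
        raise ValueError('placeholder was not fed')
    args = [run(inp, feed_dict) for inp in node.inputs]
    return node.fn(*args)

calls = []  # records every time the parsing function really executes

def parse_csv(text):
    calls.append(text)
    return [float(v) for v in text.split(',')]

placeholder = Node()                     # has no value yet
parsed = Node(parse_csv, [placeholder])  # only adds a node, parses nothing
calls_after_build = len(calls)

result = run(parsed, {placeholder: '1.0,2.5'})
calls_after_run = len(calls)

print(calls_after_build, calls_after_run, result)  # 0 1 [1.0, 2.5]
```

Building the parsed node leaves the call log empty. The parsing function only executes once run is called with a value fed for the placeholder.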

The tensor, after parsing, is tf_example['x'].


EDIT:

To elaborate some more, in response to your comment: in your example, serialized_tf_example is a tf.Tensor (as can be verified by running type(serialized_tf_example)). Its data type is string (as can be verified by running serialized_tf_example.dtype). As with all tensors, this means that you can feed it any multidimensional array of its data type (here, strings). (If you had passed a value for the shape argument in the tf.placeholder() call, you would be restricted to arrays of that specific shape.) So, for example, these are valid calls:

sess = tf.Session()
sess.run(serialized_tf_example, feed_dict={serialized_tf_example:'abc'})
sess.run(serialized_tf_example, feed_dict={serialized_tf_example:[['some','strings'],['in an','array']]})

This tensor instance is named serialized_tf_example because its purpose is to be fed a serialized encoding of the example that is going to be processed. This encoding is then converted to a numerical multidimensional array using tf.parse_example().

Upvotes: 1
