Reputation: 261
This is the code I used to convert the data to TFRecord:
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _floats_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))
with tf.python_io.TFRecordWriter("train.tfrecords") as writer:
    for row in train_data:
        prices, label, pip = row[0], row[1], row[2]
        prices = np.asarray(prices).astype(np.float32)
        example = tf.train.Example(features=tf.train.Features(feature={
            'prices': _floats_feature(prices),
            'label': _int64_feature(label[0]),
            'pip': _floats_feature(pip)
        }))
        writer.write(example.SerializeToString())
The prices feature is an array of shape (1, 288). It converted successfully! But then I decoded the data using the following parse function and the Dataset API:
def parse_func(serialized_data):
    keys_to_features = {'prices': tf.FixedLenFeature([], tf.float32),
                        'label': tf.FixedLenFeature([], tf.int64)}
    parsed_features = tf.parse_single_example(serialized_data, keys_to_features)
    return parsed_features['prices'], tf.one_hot(parsed_features['label'], 2)
It gave me this error:
2018-03-31 15:37:11.443073: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1202] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: prices. Can't parse serialized Example.
2018-03-31 15:37:11.443313: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1202] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: prices. Can't parse serialized Example.
  raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Key: prices. Can't parse serialized Example.
  [[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_INT64, DT_FLOAT], dense_keys=["label", "prices"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const_1)]]
  [[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[[?], [?,2]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]
Upvotes: 4
Views: 6529
Reputation: 261
I found the problem. Instead of using tf.io.FixedLenFeature for parsing an array, use tf.io.FixedLenSequenceFeature (for TensorFlow 1, use the tf. prefix instead of tf.io.).
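For illustration, a minimal sketch of what the fixed parse function could look like, assuming the question's feature names and TensorFlow 1.x APIs (with the tf.io.* names the idea is the same):

def parse_func(serialized_data):
    keys_to_features = {
        # the array-valued feature uses FixedLenSequenceFeature;
        # allow_missing=True lets parse_single_example return it as a dense tensor
        'prices': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
        'label': tf.FixedLenFeature([], tf.int64)}
    parsed_features = tf.parse_single_example(serialized_data, keys_to_features)
    return parsed_features['prices'], tf.one_hot(parsed_features['label'], 2)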
Upvotes: 10
Reputation: 11
Exactly the same thing happens to me when reading float32 data lists from TFRecord files. I get "Can't parse serialized Example" when executing sess.run([time_tensor, frequency_tensor, frequency_weight_tensor]) with tf.FixedLenFeature, though tf.FixedLenSequenceFeature seems to be working fine.
My feature format for reading files (the working one) is as follows:
feature_format = {
    'time': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
    'frequencies': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
    'frequency_weights': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True)
}
The encoding part is:
feature = {
    'time': tf.train.Feature(float_list=tf.train.FloatList(value=[*some single value*])),
    'frequencies': tf.train.Feature(float_list=tf.train.FloatList(value=*some_list*)),
    'frequency_weights': tf.train.Feature(float_list=tf.train.FloatList(value=*some_list*))
}
This happens with TensorFlow 1.12 on a Debian machine without GPU offloading (i.e. only the CPU is used by TensorFlow).
Is there any misuse on my side, or is it a bug in the code or documentation? I could contribute/upstream a fix if that would benefit anyone...
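For context, a hedged sketch of how this feature_format might be wired into a session run (the tensor names match the post above; the file name and dataset plumbing are illustrative):

dataset = tf.data.TFRecordDataset(["data.tfrecord"])  # hypothetical file name
dataset = dataset.map(lambda s: tf.parse_single_example(s, feature_format))
iterator = dataset.make_one_shot_iterator()
parsed = iterator.get_next()
time_tensor = parsed['time']
frequency_tensor = parsed['frequencies']
frequency_weight_tensor = parsed['frequency_weights']

with tf.Session() as sess:
    print(sess.run([time_tensor, frequency_tensor, frequency_weight_tensor]))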
Upvotes: 0
Reputation: 1920
If your feature is a fixed-length 1-D array, then using tf.FixedLenSequenceFeature is not correct at all. As the documentation mentions, tf.FixedLenSequenceFeature is for input data with dimension 2 or higher. In this example you need to flatten your prices array to shape (288,), and then for the decoding part you need to specify the array dimension.
Encode:
example = tf.train.Example(features=tf.train.Features(feature={
    'prices': _floats_feature(prices.tolist()),
    'label': _int64_feature(label[0]),
    'pip': _floats_feature(pip)
}))
Decode:
keys_to_features = {'prices': tf.FixedLenFeature([288], tf.float32),
                    'label': tf.FixedLenFeature([], tf.int64)}
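To see the whole path, a hedged sketch of plugging this decode dict (inside the question's parse_func) into the Dataset API; the batch size here is purely illustrative:

dataset = tf.data.TFRecordDataset("train.tfrecords")
dataset = dataset.map(parse_func)   # parse_func uses the keys_to_features above
dataset = dataset.batch(32)
iterator = dataset.make_one_shot_iterator()
prices_batch, labels_batch = iterator.get_next()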
Upvotes: 4
Reputation: 307
I had the same issue while carelessly modifying some scripts; it was caused by a slightly different data shape. I had to change the shape to match the expected shape, e.g. (A, B) to (1, A, B). I used np.ravel() for flattening.
Upvotes: 0
Reputation: 768
You can't store an n-dimensional array as a float feature, as float features are simple lists. You have to flatten prices into a list by doing prices.tolist(). If you need to recover the n-dimensional array from the flattened float feature, then you can do prices = np.reshape(float_feature, original_shape).
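As a small illustration of that round trip (the shape mirrors the question's (1, 288) prices array; the variable names are just for the example):

import numpy as np

prices = np.zeros((1, 288), dtype=np.float32)         # n-dimensional array from the question
float_feature = prices.flatten().tolist()             # flatten to a plain list for the float feature
recovered = np.reshape(float_feature, prices.shape)   # recover the original (1, 288) shape after parsing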
Upvotes: 1