Reputation: 1460
I'm trying to generate TFRecords using the code provided here.
The issue is that it's very outdated, and a lot of the code was built for TensorFlow 1.x.
Is there currently any way to generate TFRecords from a CSV file using the latest version of TensorFlow?
I'm really at a loss in terms of finding help :(
Thanks!
Upvotes: 1
Views: 2593
Reputation: 106
First you need to access the data inside your CSV file using pandas or another library.
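As a rough sketch of that first step, the snippet below reads rows with Python's built-in csv module instead of pandas. The column names (filename, score, label) and the inline CSV text are made up for illustration; swap in your own columns and file.

```python
import csv
import io

# Hypothetical CSV contents; the column names are assumptions for illustration.
CSV_TEXT = """filename,score,label
cat.jpg,0.5,1
dog.jpg,0.25,0
"""


def read_rows(csv_file):
    """Yield one dict per CSV row, converting each column to a Python type."""
    for row in csv.DictReader(csv_file):
        yield {
            "filename": row["filename"],
            "score": float(row["score"]),
            "label": int(row["label"]),
        }


# io.StringIO stands in for a real file object here.
rows = list(read_rows(io.StringIO(CSV_TEXT)))
print(rows[0])
```

With a real file you would pass an open file object instead, e.g. open("data.csv", newline=""), where data.csv is a placeholder name.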
Then:
Create a writer for the output file with this function:
tf.io.TFRecordWriter(tf_record_filename)
Depending on your data, add the necessary helper functions to your code. If you are using image data, use _bytes_feature and convert your image to bytes first, for example with np.ndarray.tobytes() (the older np.ndarray.tostring() is deprecated).
def _bytes_feature(value):
    """Returns a bytes_list from a string / byte."""
    if isinstance(value, type(tf.constant(0))):
        value = value.numpy()  # BytesList won't unpack a string from an EagerTensor.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(value):
    """Returns a float_list from a float / double."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _int64_feature(value):
    """Returns an int64_list from a bool / enum / int / uint."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
Iterate through your dataset and create a tf.train.Example on every iteration by feeding it a dictionary containing your data. In the example here, the arguments to the helper functions are the values you read from the current row of your CSV file.
example = tf.train.Example(features=tf.train.Features(feature={
    'image_data': _bytes_feature(some_bytes_image),
    'float_data': _float_feature(some_float),
    'int64_data': _int64_feature(some_integer)  # the helper is _int64_feature, not _int_feature
}))
Don't forget to write it to your TFRecord file by using this line of code inside your loop.
writer.write(example.SerializeToString())
Close the writer when you are done.
writer.close()
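Putting the steps together, a minimal end-to-end sketch might look like this. The rows list, the feature names (name, score, label) and the temporary output path are assumptions for illustration, standing in for values read from your CSV. The writer is used as a context manager here, so it is closed automatically and the explicit writer.close() is not needed.

```python
import os
import tempfile

import tensorflow as tf


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


# Hypothetical rows standing in for data already read from a CSV file.
rows = [
    {"name": "a", "score": 0.5, "label": 1},
    {"name": "b", "score": 0.25, "label": 0},
]

# Hypothetical output location in a fresh temporary directory.
tf_record_filename = os.path.join(tempfile.mkdtemp(), "data.tfrecord")

# The context manager closes the writer even if an exception is raised.
with tf.io.TFRecordWriter(tf_record_filename) as writer:
    for row in rows:
        example = tf.train.Example(features=tf.train.Features(feature={
            # Strings must be encoded to bytes before going into a BytesList.
            "name": _bytes_feature(row["name"].encode("utf-8")),
            "score": _float_feature(row["score"]),
            "label": _int64_feature(row["label"]),
        }))
        writer.write(example.SerializeToString())

# Read the file back to check that the records round-trip.
feature_spec = {
    "name": tf.io.FixedLenFeature([], tf.string),
    "score": tf.io.FixedLenFeature([], tf.float32),
    "label": tf.io.FixedLenFeature([], tf.int64),
}
parsed = [
    tf.io.parse_single_example(record, feature_spec)
    for record in tf.data.TFRecordDataset(tf_record_filename)
]
print(len(parsed), "records written and read back")
```

Reading the file back right after writing is optional, but it is a cheap way to confirm that the feature names and types match what your training code will expect.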
If you need any other information, consult the official TensorFlow documentation.
Upvotes: 1