SunAwtCanvas

Reputation: 1460

Tensorflow 2.1.0 - Generate TFRecord from CSV

I'm trying to generate TFRecords using the code provided here.

The issue is that it's very outdated, and a lot of the code was written for TensorFlow 1.x.

So is there currently any way to generate TFRecords from a CSV file using the latest version of TensorFlow?

I'm really at a loss in terms of finding help :(

Thanks!

Upvotes: 1

Views: 2593

Answers (1)

Silvestre Bahi

Reputation: 106

First, you need to read the data from your CSV file using pandas or another library.
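
As a minimal sketch of that first step, assuming a hypothetical CSV with `filename`, `score` and `label` columns (names made up for illustration), the standard-library `csv` module is enough; `pandas.read_csv` would work just as well:

```python
import csv
import io

# Hypothetical CSV content; in practice you would open your own file instead.
CSV_TEXT = """filename,score,label
cat.jpg,0.5,1
dog.jpg,0.25,0
"""

def read_rows(csv_file):
    """Yield one dict per CSV row, converting numeric columns from str."""
    for row in csv.DictReader(csv_file):
        yield {
            "filename": row["filename"],
            "score": float(row["score"]),
            "label": int(row["label"]),
        }

rows = list(read_rows(io.StringIO(CSV_TEXT)))
print(rows[0])
```

Every value comes out of the CSV reader as a string, so converting each column to the type you intend to store is worth doing at this stage.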

Then:

  • Create a writer with tf.io.TFRecordWriter, passing the path of the output file.

    tf.io.TFRecordWriter(tf_record_filename)
    
  • Depending on your data, add the helper functions you need to your code. If you are using image data, use _bytes_feature and convert your image array to bytes first with np.ndarray.tobytes() (tostring() is a deprecated alias of it).

    def _bytes_feature(value):
      """Returns a bytes_list from a string / byte."""
      if isinstance(value, type(tf.constant(0))):
        value = value.numpy() # BytesList won't unpack a string from an EagerTensor.
      return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
    
    def _float_feature(value):
      """Returns a float_list from a float / double."""
      return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))
    
    def _int64_feature(value):
      """Returns an int64_list from a bool / enum / int / uint."""
      return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
    
  • Iterate through your dataset and create a tf.train.Example on every iteration by feeding it a dictionary containing your data. Here's an example where the arguments of the helper functions are the values you read from the current row of your CSV file.

    example = tf.train.Example(features=tf.train.Features(feature={
        'image_data': _bytes_feature(some_bytes_image),
        'float_data': _float_feature(some_float),
        'int64_data': _int64_feature(some_integer)
    }))
    
  • Don't forget to write the serialized example to your TFRecord file with this line of code inside your loop.

    writer.write(example.SerializeToString())
    
  • Close the writer once the loop has finished.

    writer.close()
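
Put together, the steps above might look like the sketch below. It is only an illustration under stated assumptions: the CSV columns (`filename`, `score`, `label`), the inline sample data and the `csv_to_tfrecord` helper are hypothetical, and the `csv` module stands in for pandas. The read-back at the end is just a sanity check that the file parses.

```python
import csv
import io
import os
import tempfile

import tensorflow as tf

# Hypothetical sample data; replace with open("your_file.csv", newline="").
CSV_TEXT = "filename,score,label\ncat.jpg,0.5,1\ndog.jpg,0.25,0\n"

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def csv_to_tfrecord(csv_file, tf_record_filename):
    """Write one tf.train.Example per CSV row; return the number written."""
    count = 0
    # Using the writer as a context manager closes it automatically.
    with tf.io.TFRecordWriter(tf_record_filename) as writer:
        for row in csv.DictReader(csv_file):
            example = tf.train.Example(features=tf.train.Features(feature={
                "filename": _bytes_feature(row["filename"].encode("utf-8")),
                "score": _float_feature(float(row["score"])),
                "label": _int64_feature(int(row["label"])),
            }))
            writer.write(example.SerializeToString())
            count += 1
    return count

out_path = os.path.join(tempfile.mkdtemp(), "data.tfrecord")
num_written = csv_to_tfrecord(io.StringIO(CSV_TEXT), out_path)

# Read the file back to check that every record parses as expected.
feature_spec = {
    "filename": tf.io.FixedLenFeature([], tf.string),
    "score": tf.io.FixedLenFeature([], tf.float32),
    "label": tf.io.FixedLenFeature([], tf.int64),
}
parsed = [
    tf.io.parse_single_example(record, feature_spec)
    for record in tf.data.TFRecordDataset(out_path)
]
print(num_written, parsed[0]["filename"].numpy())
```

The `with` block replaces the explicit `writer.close()` call; either form works, but the context manager also closes the file if a row raises while being converted.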
    

If you need any other information about it, you should consult the official documentation.

Upvotes: 1
