karl_TUM

Reputation: 5929

Good way to input PASCAL-VOC 2012 training data and labels with tensorflow

I want to do object detection on the PASCAL VOC 2012 dataset with TensorFlow.

I want to feed whole images, together with their object labels and the corresponding bounding boxes, into TensorFlow for training.

Is there a good way to write a data file for TensorFlow to read? Or should I just read the original XML files in TensorFlow?

Thank you very much.

Here is an image example: [image]

Upvotes: 3

Views: 7019

Answers (3)

user3970726

Reputation:

There are pre-made tools for that; look in the TensorFlow models repository. Their approach, in essence, is:

  1. Parse the XML annotation files and flatten the data structure within them.
  2. Produce a TFRecord file that combines the annotations and the images.

This is arguably the best way.
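To illustrate step 1, here is a minimal sketch of flattening one annotation using only the Python standard library. It is not the code from the TensorFlow models repository; the function name, the output keys, and the sample annotation (file name and coordinates) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in PASCAL VOC layout; the values are illustrative only.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
""".strip()


def flatten_voc_annotation(xml_text):
    """Flatten one VOC annotation into parallel lists, one entry per object."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    flat = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "labels": [],
        "xmin": [], "ymin": [], "xmax": [], "ymax": [],
    }
    # Only direct <object> children of the root are read; nested parts are ignored.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        flat["labels"].append(obj.findtext("name"))
        for key in ("xmin", "ymin", "xmax", "ymax"):
            # Parse via float first in case a file stores fractional coordinates.
            flat[key].append(int(float(box.findtext(key))))
    return flat
```

The result is a flat dictionary of scalars and equally long lists, which is the shape a TFRecord example expects.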

For the sake of training, you can implement your own converter that takes an (xml, image) pair and saves it as a TFRecord example.

TFRecord is TensorFlow's format for storing data. Every TFRecord file is basically a list of examples; every example is an object that holds data as key: value pairs, where the key is a string and the value is an array of a primitive type (int, string, float).

So first you flatten your XML annotation to match the constraints of a TFRecord file, then you use TensorFlow's TFRecordWriter to save the data to a file. Check the TensorFlow API; it will pay off.
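A rough sketch of such a converter is shown below. It assumes the annotation has already been flattened into a dictionary of lists (the `flat` layout here is hypothetical), that the image is available as encoded bytes, and that a recent TensorFlow exposing `tf.io.TFRecordWriter` is installed. The feature key names and the choice to normalize box coordinates are assumptions made for this example, not a required schema.

```python
def normalized_boxes(flat):
    """Scale pixel box coordinates to the [0, 1] range (a convention assumed here)."""
    width, height = float(flat["width"]), float(flat["height"])
    return {
        "xmin": [x / width for x in flat["xmin"]],
        "ymin": [y / height for y in flat["ymin"]],
        "xmax": [x / width for x in flat["xmax"]],
        "ymax": [y / height for y in flat["ymax"]],
    }


def make_example(flat, image_bytes):
    """Build one tf.train.Example from a flattened annotation and encoded image bytes."""
    # Imported here so normalized_boxes() can be used without TensorFlow installed.
    import tensorflow as tf

    def bytes_feature(values):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=values))

    def float_feature(values):
        return tf.train.Feature(float_list=tf.train.FloatList(value=values))

    def int_feature(values):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=values))

    boxes = normalized_boxes(flat)
    # Key names below are illustrative; any string keys work.
    feature = {
        "image/encoded": bytes_feature([image_bytes]),
        "image/width": int_feature([flat["width"]]),
        "image/height": int_feature([flat["height"]]),
        "object/label": bytes_feature([n.encode("utf8") for n in flat["labels"]]),
        "object/xmin": float_feature(boxes["xmin"]),
        "object/ymin": float_feature(boxes["ymin"]),
        "object/xmax": float_feature(boxes["xmax"]),
        "object/ymax": float_feature(boxes["ymax"]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(path, pairs):
    """Write (flat_annotation, image_bytes) pairs into a single TFRecord file."""
    import tensorflow as tf

    with tf.io.TFRecordWriter(path) as writer:
        for flat, image_bytes in pairs:
            writer.write(make_example(flat, image_bytes).SerializeToString())
```

Each value is wrapped in a list because a TFRecord feature always holds an array, even for a single width or height.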

Upvotes: 2

Kaustav Mukherjee

Reputation: 1

First use labelImg-master to convert the boxed pictures into VOC annotated format, then use my utility from the link below to convert the VOC annotated files to .npz. The .npz format is a good, performance-efficient way to store both data and labels for image processing using Keras on TensorFlow.

Below is the code to convert any PASCAL VOC annotated files to .npz:

https://github.com/MATRIX4284/VOC_NPZ
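For readers unfamiliar with .npz, the following is a small sketch of how images and boxes could be stored in one archive with NumPy. It is not the layout used by the linked utility; the array names and the idea of a separate index array that maps each box to its image are assumptions made for this example, and it assumes all images share one size.

```python
import io

import numpy as np


def pack_npz(images, boxes_per_image, class_ids_per_image):
    """Pack same-sized images and a variable number of boxes per image into npz bytes."""
    # One index entry per box, pointing at the image that box belongs to.
    box_image_index = [i for i, boxes in enumerate(boxes_per_image) for _ in boxes]
    flat_boxes = [box for boxes in boxes_per_image for box in boxes]
    flat_classes = [c for ids in class_ids_per_image for c in ids]

    buffer = io.BytesIO()
    np.savez_compressed(
        buffer,
        images=np.asarray(images, dtype=np.uint8),
        boxes=np.asarray(flat_boxes, dtype=np.int32).reshape(-1, 4),
        class_ids=np.asarray(flat_classes, dtype=np.int32),
        box_image_index=np.asarray(box_image_index, dtype=np.int32),
    )
    return buffer.getvalue()


def boxes_for_image(archive, image_index):
    """Return the (k, 4) array of boxes that belong to one image."""
    mask = archive["box_image_index"] == image_index
    return archive["boxes"][mask]
```

Loading works with `np.load`, either on a file path or on `io.BytesIO(data)`. Keeping the boxes in one flat array avoids object arrays, so the archive can be read without enabling pickling.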

Upvotes: -1

ckorzhik

Reputation: 798

It seems that TF has no support for XML files yet.

  1. You can try to make batches yourself and feed them to TF placeholders. https://www.tensorflow.org/versions/r0.10/how_tos/reading_data/index.html#feeding

  2. You can write your own file format and your own decoder. Then you can read the file, get its bytes with the tf.decode_raw function, and do whatever you want. Related question, if you want to read multiple files simultaneously: Tensorflow read images with labels

I think the first option is easier to implement.
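As a sketch of the first option, the batching part can be plain Python. The helper below is hypothetical and only groups already-loaded (image, label) samples; it does not touch TensorFlow itself.

```python
def iterate_batches(samples, batch_size):
    """Yield (images, labels) tuples holding at most batch_size samples each."""
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        images = [image for image, _ in chunk]
        labels = [label for _, label in chunk]
        yield images, labels


# With the placeholder-based (TF 1.x style) API, each batch would then be fed
# through feed_dict, for example (image_ph, label_ph, train_op are hypothetical):
#   sess.run(train_op, feed_dict={image_ph: images, label_ph: labels})
```

The last batch may be smaller than `batch_size`, so the placeholders would need a batch dimension that is left unspecified.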

Upvotes: 0
