user30985

Reputation: 683

How to use a PASCAL VOC dataset in XML format to build a model in TensorFlow

I have a dataset in PASCAL VOC format and would like to use it to build a deep learning model in TensorFlow. I think I need to convert it to the TFRecord file format first, but I am not sure that is correct. If it is, what code converts PASCAL VOC annotations to TFRecord? If it is not, how else can I load this dataset to build a model in TensorFlow? This is a sample annotation from my dataset:

<annotation>
  <filename>000000000.jpg</filename>
  <source>
    <annotation>ArcGIS Pro 2.1</annotation>
  </source>
  <size>
    <width>256</width>
    <height>256</height>
    <depth>3</depth>
  </size>
  <object>
    <name>0</name>
    <bndbox>
        <xmin>209.62</xmin>
        <ymin>3.86</ymin>
        <xmax>256.00</xmax>
        <ymax>70.93</ymax>
    </bndbox>
 </object>
 <object>
    <name>0</name>
    <bndbox>
        <xmin>120.92</xmin>
        <ymin>126.09</ymin>
        <xmax>200.23</xmax>
        <ymax>209.97</ymax>
    </bndbox>
 </object>
 <object>
    <name>0</name>
    <bndbox>
        <xmin>237.72</xmin>
        <ymin>136.02</ymin>
        <xmax>256.00</xmax>
        <ymax>214.18</ymax>
    </bndbox>
 </object>
</annotation>

Upvotes: 1

Views: 7505

Answers (2)

Vlad

Reputation: 8595

VOC2007 is available in the latest version of tensorflow-datasets (1.0.2), which is not on pip yet.

To install it from source, run this in a terminal:

git clone https://github.com/tensorflow/datasets
cd datasets
python setup.py build
python setup.py install

Usage example (plot in Jupyter):

import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageDraw
%matplotlib inline

OUTLINE = (0, 255, 0)

builder = tfds.builder('voc2007')
builder.download_and_prepare()
datasets = builder.as_dataset()
train_data, test_data = datasets['train'], datasets['test']
iterator = train_data.repeat(1).batch(1).make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for _ in range(1):
        batch = sess.run(next_batch)
        # Drop the batch dimension (the batch size is 1).
        image = np.squeeze(batch['image'], axis=0)
        # Reshape so that a single box is still a (1, 4) array.
        bboxes = np.reshape(batch['objects']['bbox'], (-1, 4))
        pil_image = Image.fromarray(image.astype('uint8'), 'RGB')
        draw = ImageDraw.Draw(pil_image)
        height, width = image.shape[:2]

        for bbox in bboxes:
            # Boxes are normalized and ordered (ymin, xmin, ymax, xmax).
            ymin, xmin, ymax, xmax = bbox
            xmin *= width
            xmax *= width
            ymin *= height
            ymax *= height
            c1 = (xmin, ymin)
            c2 = (xmax, ymin)
            c3 = (xmax, ymax)
            c4 = (xmin, ymax)
            draw.line([c1, c2, c3, c4, c1],
                      fill=OUTLINE,
                      width=3)
        # figsize is (width, height) in inches.
        plt.figure(figsize=(width / 50, height / 50))
        plt.imshow(np.array(pil_image))
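The loop above scales normalized boxes up to pixels. Going the other way is useful when annotations store pixel corners, as the XML in the question does. A minimal sketch, assuming the target format is a normalized (ymin, xmin, ymax, xmax) box as unpacked in the loop; `to_normalized` is a hypothetical helper name, not part of tensorflow-datasets:

```python
def to_normalized(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel corners to a normalized (ymin, xmin, ymax, xmax) box."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (ymin / height, xmin / width, ymax / height, xmax / width)


# First object from the question's XML, on a 256x256 image.
box = to_normalized(209.62, 3.86, 256.00, 70.93, width=256, height=256)
print(box)
```

Note the change of order: VOC XML lists x before y, while the boxes in the snippet above list y first.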


Upvotes: 1

gdelab

Reputation: 6220

The TensorFlow Object Detection API provides a tool for this. Run the following command:

python object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=<path/to/label/map.pbtxt> \
    --data_dir=<path/to/data/dir> --year=<year_directory_name> --set=<train|test|val> \
    --output_path=pascal_<train|test|val>.record

This expects a tree of the form

data_dir
|- year_dir
   |- Annotations
      |- *.xml
   |- ImageSets
      |- Layout
        |- test.txt
        |- train.txt
        |- val.txt
        |- trainval.txt
      |- Main
        |- *.txt
   |- JPEGImages
      |- *.jpg
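Before running the script it can help to confirm that the directories in this tree exist. A small sketch using only the standard library; `missing_voc_dirs` and `EXPECTED_DIRS` are hypothetical names for illustration, and the check covers only the directories listed above, not the files inside them:

```python
from pathlib import Path

# Sub-directories named in the tree above, relative to <data_dir>/<year_dir>.
EXPECTED_DIRS = ("Annotations", "ImageSets/Layout", "ImageSets/Main", "JPEGImages")


def missing_voc_dirs(data_dir, year_dir):
    """Return the expected sub-directories that do not exist."""
    root = Path(data_dir) / year_dir
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # "VOCdevkit" and "VOC2012" are example values; substitute your own.
    print(missing_voc_dirs("VOCdevkit", "VOC2012"))
```

An empty list means every listed directory was found.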

For instance, with the standard PASCAL VOC 2012 dataset, the command would be:

python object_detection/dataset_tools/create_pascal_tf_record.py \
    --label_map_path=object_detection/data/pascal_label_map.pbtxt \
    --data_dir=VOCdevkit --year=VOC2012 --set=val \
    --output_path=pascal_val.record
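If the script does not fit your files (the XML in the question stores fractional pixel coordinates and omits fields that standard VOC annotations carry, such as `difficult`), you can read the annotations yourself. A minimal sketch using only the standard library; `parse_voc_annotation` is a hypothetical helper, not part of the Object Detection API, and it reads only the fields shown in the question:

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Parse one VOC annotation into a dict with image size and pixel boxes."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    record = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    # Only direct <object> children of the root are annotations.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        record["objects"].append({
            "name": obj.findtext("name"),
            # float() because the question's boxes are fractional pixels.
            "xmin": float(box.findtext("xmin")),
            "ymin": float(box.findtext("ymin")),
            "xmax": float(box.findtext("xmax")),
            "ymax": float(box.findtext("ymax")),
        })
    return record


# Shortened copy of the annotation from the question.
SAMPLE = """<annotation>
  <filename>000000000.jpg</filename>
  <size><width>256</width><height>256</height><depth>3</depth></size>
  <object>
    <name>0</name>
    <bndbox>
      <xmin>209.62</xmin><ymin>3.86</ymin>
      <xmax>256.00</xmax><ymax>70.93</ymax>
    </bndbox>
  </object>
</annotation>"""

record = parse_voc_annotation(SAMPLE)
print(record["filename"], len(record["objects"]))
```

The resulting dicts can then be serialized in whatever form your input pipeline expects, for example as `tf.train.Example` records.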

Upvotes: 3
