Reputation: 683
I have a PASCAL VOC dataset that I would like to use to build a deep learning model in TensorFlow. I think I need to convert it to the TFRecord file format first, but I am not sure that is correct. If it is, what code converts PASCAL VOC to the TFRecord file format? If it is not, how would you suggest loading this PASCAL VOC dataset to build a model in TensorFlow? This is an annotation from my PASCAL VOC dataset:
<annotation>
  <filename>000000000.jpg</filename>
  <source>
    <annotation>ArcGIS Pro 2.1</annotation>
  </source>
  <size>
    <width>256</width>
    <height>256</height>
    <depth>3</depth>
  </size>
  <object>
    <name>0</name>
    <bndbox>
      <xmin>209.62</xmin>
      <ymin>3.86</ymin>
      <xmax>256.00</xmax>
      <ymax>70.93</ymax>
    </bndbox>
  </object>
  <object>
    <name>0</name>
    <bndbox>
      <xmin>120.92</xmin>
      <ymin>126.09</ymin>
      <xmax>200.23</xmax>
      <ymax>209.97</ymax>
    </bndbox>
  </object>
  <object>
    <name>0</name>
    <bndbox>
      <xmin>237.72</xmin>
      <ymin>136.02</ymin>
      <xmax>256.00</xmax>
      <ymax>214.18</ymax>
    </bndbox>
  </object>
</annotation>
Upvotes: 1
Views: 7505
Reputation: 8595
VOC2007 is available in the latest tensorflow-datasets==1.0.2 version (which is not available on pip yet).
To install it from source, run this in a terminal:
git clone https://github.com/tensorflow/datasets
cd datasets
python setup.py build
python setup.py install
Usage example (plot in Jupyter):
import tensorflow as tf  # uses the TF 1.x graph/session API
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageDraw
%matplotlib inline

OUTLINE = (0, 255, 0)  # bounding-box colour (RGB green)

# Download VOC2007 and build the train/test datasets.
builder = tfds.builder('voc2007')
builder.download_and_prepare()
datasets = builder.as_dataset()
train_data, test_data = datasets['train'], datasets['test']

iterator = train_data.repeat(1).batch(1).make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    for _ in range(1):
        batch = sess.run(next_batch)
        # Drop the batch dimension (the batch size is 1).
        image = np.squeeze(batch['image'], axis=0)
        # Keep one row per object, even if the image has a single box.
        bboxes = np.reshape(batch['objects']['bbox'], (-1, 4))
        pil_image = Image.fromarray(image.astype('uint8'), 'RGB')
        draw = ImageDraw.Draw(pil_image)
        height, width = image.shape[:2]
        for bbox in bboxes:
            # Boxes are normalized; scale them to pixel coordinates.
            ymin, xmin, ymax, xmax = bbox
            xmin *= width
            xmax *= width
            ymin *= height
            ymax *= height
            c1 = (xmin, ymin)
            c2 = (xmax, ymin)
            c3 = (xmax, ymax)
            c4 = (xmin, ymax)
            draw.line([c1, c2, c3, c4, c1],
                      fill=OUTLINE,
                      width=3)
        asnumpy = np.array(pil_image)
        # figsize is (width, height) in inches.
        figure = plt.figure(figsize=(width / 50, height / 50))
        plt.imshow(asnumpy)
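The scaling step in the loop above can be pulled out into a small helper. This is only a sketch: the name to_pixel_box is made up for illustration, and it assumes the boxes are normalized and ordered (ymin, xmin, ymax, xmax), as the unpacking in the code above implies.

```python
def to_pixel_box(bbox, width, height):
    """Convert a normalized (ymin, xmin, ymax, xmax) box
    into pixel coordinates ordered (xmin, ymin, xmax, ymax)."""
    ymin, xmin, ymax, xmax = bbox
    return (xmin * width, ymin * height, xmax * width, ymax * height)


# Hypothetical box covering the right half of a 256x128 image,
# from one quarter to three quarters of its height.
box = to_pixel_box((0.25, 0.5, 0.75, 1.0), width=256, height=128)
print(box)  # (128.0, 32.0, 256.0, 96.0)
```

A tuple in this order should also be usable with Pillow's ImageDraw.rectangle, which would replace the four corner points and the draw.line call.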
Upvotes: 1
Reputation: 6220
The TensorFlow Object Detection API provides a tool for this. You can run the following command:
python object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=<path/to/label/map.pbtxt> \
--data_dir=<path/to/data/dir> --year=<year_directory_name> --set=<train|test|val> \
--output_path=pascal_<train|test|val>.record
This expects a tree of the form:
data_dir
|- year_dir
   |- Annotations
      |- *.xml
   |- ImageSets
      |- Layout
         |- test.txt
         |- train.txt
         |- val.txt
         |- trainval.txt
      |- Main
         |- *.txt
   |- JPEGImages
      |- *.jpg
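Each file under Annotations is an XML document like the one in the question. To check what a converter would read from such a file, a minimal sketch using only the Python standard library might look like this. The function name parse_voc_annotation is hypothetical, the sample is a trimmed copy of the question's annotation, and this is not the code the Object Detection API script itself runs.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the annotation from the question (first object only).
SAMPLE_XML = """
<annotation>
  <filename>000000000.jpg</filename>
  <size><width>256</width><height>256</height><depth>3</depth></size>
  <object>
    <name>0</name>
    <bndbox>
      <xmin>209.62</xmin><ymin>3.86</ymin>
      <xmax>256.00</xmax><ymax>70.93</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return (filename, (width, height), [(class_name, box), ...])."""
    root = ET.fromstring(xml_text.strip())
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    objects = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # float() rather than int(): the coordinates in the question
        # are fractional pixel values.
        box = tuple(float(bndbox.findtext(tag))
                    for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), box))
    return root.findtext("filename"), (width, height), objects


filename, size, objects = parse_voc_annotation(SAMPLE_XML)
print(filename, size, objects)
```

Running this over every file in Annotations is a quick way to spot class names or coordinates that a conversion script might not expect.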
For instance, with the usual PASCAL dataset, the command would be:
python object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=VOCdevkit --year=VOC2012 --set=val \
--output_path=pascal_val.record
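For a custom dataset such as the one in the question, the file passed to --label_map_path has to list your own classes instead of the standard PASCAL ones. The sketch below generates that text, assuming the usual item { id, name } layout of files like pascal_label_map.pbtxt, with ids starting at 1 because 0 is reserved for the background class. The helper name build_label_map is made up for illustration.

```python
def build_label_map(class_names):
    """Render class names as label-map text, assigning ids 1, 2, 3, ..."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        entries.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(entries)


# The annotation in the question has a single class named "0".
label_map_text = build_label_map(["0"])
print(label_map_text)
```

Saving the printed text to a .pbtxt file and pointing --label_map_path at it should let the script map the <name> values in the annotations to integer class ids.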
Upvotes: 3