Jame

Reputation: 3854

How to update object information in XML?

I have an XML file (gt.xml) as follows:

<annotation>
    <object>
        <name>class1</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>805</xmin>
            <ymin>140</ymin>
            <xmax>975</xmax>
            <ymax>300</ymax>
        </bndbox>
    </object>
    <object>
        <name>class2</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>816</xmin>
            <ymin>386</ymin>
            <xmax>1000</xmax>
            <ymax>575</ymax>
        </bndbox>
    </object>
</annotation>

I also have a list of dicts that holds the new information:

objects=[{'name': 'class1', 'bbox': [813, 141, 964, 296]}, {'name': 'class2', 'bbox': [824, 389, 989, 568]}]

I want to write each bbox into the XML file, in the object whose name matches. The expected content of gt.xml after the update is:

<annotation>
    <object>
        <name>class1</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>813</xmin> 
            <ymin>141</ymin>
            <xmax>964</xmax>
            <ymax>296</ymax>
        </bndbox>
    </object>
    <object>
        <name>class2</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>824</xmin> 
            <ymin>389</ymin>
            <xmax>989</xmax>
            <ymax>568</ymax>
        </bndbox>
    </object>
</annotation>

This is my update function:

def update_xml(filename, object):
  """ Parse a PASCAL VOC xml file """
  xml_file = os.path.join(dst_xml_dir, filename)
  tree = ET.parse(xml_file)
  print (len(object))
  for obj in tree.findall('object'):
      for obj_rotate in range(len(object)):
          print (object[obj_rotate]['bbox'])
          if(obj.find('name').text == object[obj_rotate]['name']):
              bbox=object[obj_rotate]['bbox']
              obj.find('bndbox').find('xmin').text= str(bbox[0])
              obj.find('bndbox').find('ymin').text = str(bbox[1])
              obj.find('bndbox').find('xmax').text= str(bbox[2])
              obj.find('bndbox').find('ymax').text = str(bbox[3])
  tree.write(xml_file)

It updates the XML correctly, but it needs two nested loops to compare each class name in the if condition. I suspect there is a better way that uses a single loop over the objects information. Is that possible in Python?

Upvotes: 1

Views: 901

Answers (2)

Vivek Kalyanarangan

Reputation: 9081

My approach is to first build a dict that maps each name to its bbox:

objects=[{'name': 'class1', 'bbox': [813, 141, 964, 296]}, {'name': 'class2', 'bbox': [824, 389, 989, 568]}]

objects_an = { obj['name']:obj['bbox'] for obj in objects }

This gives:

{'class2': [824, 389, 989, 568], 'class1': [813, 141, 964, 296]}
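One caveat with this mapping: a dict keyed by name keeps only the last bbox for each name, so it assumes names are unique within a file. The sample file does not show a repeated name, so the following is only a sketch for that hypothetical case. It groups the boxes per name in input order, and `boxes_by_name` and the duplicated 'class1' entry are made up for illustration:

```python
from collections import defaultdict

# Hypothetical input: two boxes share the name 'class1'.
objects = [
    {'name': 'class1', 'bbox': [813, 141, 964, 296]},
    {'name': 'class1', 'bbox': [10, 20, 30, 40]},
    {'name': 'class2', 'bbox': [824, 389, 989, 568]},
]

# Collect every bbox under its name instead of overwriting earlier ones.
boxes_by_name = defaultdict(list)
for obj in objects:
    boxes_by_name[obj['name']].append(obj['bbox'])
```

With unique names, as in the question, the plain dict comprehension is enough.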

With that out of the way, it's just the traversal now. Here is the full code:

objects=[{'name': 'class1', 'bbox': [813, 141, 964, 296]}, {'name': 'class2', 'bbox': [824, 389, 989, 568]}]

objects_an = { obj['name']:obj['bbox'] for obj in objects }
print(objects_an)

import xml.etree.ElementTree as ET  # "from xml import etree" does not load the ElementTree submodule
e = ET.parse('gt.xml')
root = e.getroot()

obj_xml = root.findall('object')

for obj in obj_xml:
    name = obj.find('name')
    bbox_mod = objects_an[name.text] # raises KeyError for an unknown name; wrap in try/except

    bbox_original = obj.find('bndbox')
    bbox_original.find('xmin').text = str(bbox_mod[0])
    bbox_original.find('ymin').text = str(bbox_mod[1])
    bbox_original.find('xmax').text = str(bbox_mod[2])
    bbox_original.find('ymax').text = str(bbox_mod[3])

e.write('gt2.xml')

You can just wrap this up in a function and it should do the trick. Hope this helps!
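As a rough sketch of that wrapping, assuming names are unique and every object has a bndbox with all four children: the function names `update_bboxes` and `update_xml`, their parameters, and the choice to return the skipped names are my own assumptions, not part of the question.

```python
import xml.etree.ElementTree as ET


def update_bboxes(root, objects):
    """Overwrite <bndbox> values in place; return names with no new bbox."""
    new_boxes = {obj['name']: obj['bbox'] for obj in objects}
    tags = ('xmin', 'ymin', 'xmax', 'ymax')
    skipped = []
    for obj in root.findall('object'):  # the only loop over the XML
        name = obj.find('name').text
        try:
            bbox = new_boxes[name]
        except KeyError:
            skipped.append(name)  # no update supplied for this object
            continue
        bndbox = obj.find('bndbox')
        for tag, value in zip(tags, bbox):
            bndbox.find(tag).text = str(value)
    return skipped


def update_xml(src_path, dst_path, objects):
    """Hypothetical file-level wrapper: parse, update, write."""
    tree = ET.parse(src_path)
    skipped = update_bboxes(tree.getroot(), objects)
    tree.write(dst_path)
    return skipped
```

Calling `update_xml('gt.xml', 'gt2.xml', objects)` would then mirror the script above. Keeping the tree manipulation in `update_bboxes` makes it easy to try on an in-memory element from `ET.fromstring` without touching any files.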

Upvotes: 1

Imran

Reputation: 13458

You can use the xmltodict library for this.

$ pip install xmltodict

import xmltodict

xml = '''
<annotation>    
    <object>
        <name>class1</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>805</xmin>
            <ymin>140</ymin>
            <xmax>975</xmax>
            <ymax>300</ymax>
        </bndbox>
    </object>
    <object>
        <name>class2</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>816</xmin>
            <ymin>386</ymin>
            <xmax>1000</xmax>
            <ymax>575</ymax>
        </bndbox>
    </object>
</annotation>
'''

d = xmltodict.parse(xml)

objects = [{'name': 'class1', 'bbox': [813, 141, 964, 296]}, {'name': 'class2', 'bbox': [824, 389, 989, 568]}]

for x in objects:
    for y in d['annotation']['object']:
        if x['name'] == y['name']:
            y['bndbox']['xmin'] = x['bbox'][0]
            y['bndbox']['ymin'] = x['bbox'][1]
            y['bndbox']['xmax'] = x['bbox'][2]
            y['bndbox']['ymax'] = x['bbox'][3]

print(xmltodict.unparse(d, pretty=True))

Output:

<?xml version="1.0" encoding="utf-8"?>
<annotation>
    <object>
        <name>class1</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>813</xmin>
            <ymin>141</ymin>
            <xmax>964</xmax>
            <ymax>296</ymax>
        </bndbox>
    </object>
    <object>
        <name>class2</name>
        <pose>Unspecified</pose>
        <bndbox>
            <xmin>824</xmin>
            <ymin>389</ymin>
            <xmax>989</xmax>
            <ymax>568</ymax>
        </bndbox>
    </object>
</annotation>

Upvotes: 0
