Sanjay

Reputation: 115

Changing bounding box coordinates in xml file as per new image width and height

I am trying to rescale the bounding box coordinates in an XML file to match a new image width and height. A sample XML file is given below:

<annotations>
 <image height="940" id="3" name="C_00080.jpg" width="1820">
  <box label="Objects" occluded="0" xbr="801.99255" xtl="777.78656" ybr="506.9955" ytl="481.82132">
   <attribute name="Class">B</attribute>
  </box>
  <box label="Objects" occluded="0" xbr="999.319" xtl="963.38654" ybr="519.2735" ytl="486.68628">
   <attribute name="Class">A</attribute>
  </box>
 </image>
</annotations>

The original image size in the XML is 1820x940, and the box coordinates are expressed relative to that size. I want to rescale the box coordinates to a new image size of 1080x720. I have written the code below; can someone verify it or suggest a better way?

import xml.etree.ElementTree as ET

label_file = '1.xml'
tree = ET.parse(label_file)
root = tree.getroot()

for image in root.findall('image'):
    image.attrib['width'] = '1080'  # Original width = 1820
    image.attrib['height'] = '720'  # Original height = 940
    for allBboxes in image.findall('box'):
        xmin = float(allBboxes.attrib['xtl'])
        xminNew = float(xmin / (1820/1080))
        xminNew = float("{:.5f}".format(xminNew))
        allBboxes.attrib['xtl'] = str(xminNew)
        ymin = float(allBboxes.attrib['ytl'])
        yminNew = float(ymin / (940/720))
        yminNew = float("{:.5f}".format(yminNew))
        allBboxes.attrib['ytl'] = str(yminNew)
        xmax = float(allBboxes.attrib['xbr'])
        xmaxNew = float(xmax / (1820/1080))
        xmaxNew = float("{:.5f}".format(xmaxNew))
        allBboxes.attrib['xbr'] = str(xmaxNew)
        ymax = float(allBboxes.attrib['ybr'])
        ymaxNew = float(ymax / (940/720))
        ymaxNew = float("{:.5f}".format(ymaxNew))
        allBboxes.attrib['ybr'] = str(ymaxNew)

tree.write(label_file)

Upvotes: 1

Views: 2440

Answers (2)

Parfait

Reputation: 107767

Consider a parameterized XSLT solution using Python's third-party module, lxml, where you pass the new width and height values from Python and the stylesheet applies the scaling formula to the XML attributes.

XSLT (save as .xsl file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <!-- PARAMS WITH DEFAULTS -->
  <xsl:param name="new_width" select="1080"/>
  <xsl:param name="new_height" select="720"/>  

  <!-- IDENTITY TRANSFORM -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- WIDTH AND HEIGHT ATTRS CHANGE -->
  <xsl:template match="image">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="width"><xsl:value-of select="$new_width"/></xsl:attribute>
      <xsl:attribute name="height"><xsl:value-of select="$new_height"/></xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- X ATTRS CHANGE -->
  <xsl:template match="box/@xbr|box/@xtl">
      <xsl:variable select="ancestor::image/@width" name="curr_width"/>

      <xsl:attribute name="{name(.)}">
          <xsl:value-of select="format-number(. div ($curr_width div $new_width) , '#.00000')"/>
      </xsl:attribute>
  </xsl:template>

  <!-- Y ATTRS CHANGE -->
  <xsl:template match="box/@ybr|box/@ytl">
      <xsl:variable select="ancestor::image/@height" name="curr_height"/>

      <xsl:attribute name="{name(.)}">
          <xsl:value-of select="format-number(. div ($curr_height div $new_height), '#.00000')"/>
      </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>

Python (no for loop or if logic)

import lxml.etree as et

# LOAD XML AND XSL SCRIPT
xml = et.parse('Input.xml')
xsl = et.parse('Script.xsl')

# PASS PARAMETERS TO XSLT
transform = et.XSLT(xsl)
result = transform(xml, new_width = et.XSLT.strparam(str(1080)), 
                        new_height = et.XSLT.strparam(str(720)))

# SAVE RESULT TO FILE
with open("Output.xml", 'wb') as f:
    f.write(result)

Output

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <image height="720" id="3" name="C_00080.jpg" width="1080">
    <box label="Objects" occluded="0" xbr="475.90767" xtl="461.54367" ybr="388.33698" ytl="369.05463">
      <attribute name="Class">B</attribute>
    </box>
    <box label="Objects" occluded="0" xbr="593.00248" xtl="571.67992" ybr="397.74140" ytl="372.78098">
      <attribute name="Class">A</attribute>
    </box>
  </image>
</annotations>
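
As a quick sanity check, here is a minimal sketch (not part of the stylesheet above) that recomputes the top-left corner of the first box in plain Python, using the same "divide by old/new ratio" formula as the XSLT. The variable names are only illustrative.

```python
# Sizes taken from the question: original 1820x940, target 1080x720
old_w, old_h = 1820, 940
new_w, new_h = 1080, 720

# Top-left corner of the first box in the sample XML
xtl, ytl = 777.78656, 481.82132

# Same formula as the XSLT: value div (current_size div new_size)
new_xtl = xtl / (old_w / new_w)
new_ytl = ytl / (old_h / new_h)

# Format to 5 decimals, mirroring format-number(..., '#.00000')
print(f"{new_xtl:.5f} {new_ytl:.5f}")
```

The printed values should match the xtl and ytl attributes of the first box in the output above.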

Upvotes: 1

Louis Lac

Reputation: 6436

To improve the code you can:

  • compute the ratios before the loop
  • remove useless float conversions
  • replace the division with a multiplication (dividing by old/new is the same as multiplying by new/old)
  • drop the rounding of the floats, which may not be necessary
  • group the statements in a coherent order
  • rename allBboxes to box, as it represents only one box

Here is one possible version:

import xml.etree.ElementTree as ET

label_file = '1.xml'
tree = ET.parse(label_file)
root = tree.getroot()

r_w = 1080 / 1820
r_h = 720 / 940

for image in root.findall('image'):
    image.attrib['width'] = '1080'  # Original width = 1820
    image.attrib['height'] = '720'  # Original height = 940

    for box in image.findall('box'):
        xmin = float(box.attrib['xtl'])
        ymin = float(box.attrib['ytl'])
        xmax = float(box.attrib['xbr'])
        ymax = float(box.attrib['ybr'])

        xminNew = xmin * r_w
        yminNew = ymin * r_h
        xmaxNew = xmax * r_w
        ymaxNew = ymax * r_h

        box.attrib['xtl'] = str(xminNew)
        box.attrib['ytl'] = str(yminNew)
        box.attrib['xbr'] = str(xmaxNew)
        box.attrib['ybr'] = str(ymaxNew)

tree.write(label_file)

You can further improve this code by wrapping all this in functions to improve usability, clarity and possible reuse.
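
A minimal sketch of what such functions could look like is shown below. The names scale_box and rescale_annotations are hypothetical, and the sketch assumes every image element carries its own width and height attributes, so the ratios are computed per image instead of being hard-coded.

```python
import xml.etree.ElementTree as ET


def scale_box(box, r_w, r_h):
    """Scale the four corner attributes of one <box> element in place."""
    for name, ratio in (('xtl', r_w), ('xbr', r_w), ('ytl', r_h), ('ybr', r_h)):
        box.set(name, str(float(box.get(name)) * ratio))


def rescale_annotations(root, new_width, new_height):
    """Rescale every box of every <image> under root to the new image size."""
    for image in root.findall('image'):
        # Ratios come from the size stored in the XML (assumed to be present)
        r_w = new_width / float(image.get('width'))
        r_h = new_height / float(image.get('height'))

        for box in image.findall('box'):
            scale_box(box, r_w, r_h)

        # Update the stored size only after the ratios have been computed
        image.set('width', str(new_width))
        image.set('height', str(new_height))


# Small in-memory demo using one box from the question
sample = (
    '<annotations>'
    '<image height="940" id="3" name="C_00080.jpg" width="1820">'
    '<box label="Objects" occluded="0" xbr="801.99255" xtl="777.78656" '
    'ybr="506.9955" ytl="481.82132"/>'
    '</image>'
    '</annotations>'
)
demo_root = ET.fromstring(sample)
rescale_annotations(demo_root, 1080, 720)
print(ET.tostring(demo_root, encoding='unicode'))
```

To use it on a file, parse with ET.parse as before, call rescale_annotations(tree.getroot(), 1080, 720), then tree.write the result.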

Upvotes: 1
