Ragnar Lothbrok

Reputation: 1135

Editing Items in an XML File in Python

I'm trying to take data from a .csv file and create individual .xml files for each row. I've read the .csv into Pandas already. Where I'm struggling is trying to figure out how to make edits in .xml files.

I'm using this previous answer as a guide to try to learn this:

Link

Applying the author's solution to my data would look something like this:

data = """<annotation>
    <folder>VOC2007</folder>
    <filename>abc.jpg</filename>
    <object>
        <name>blah</name>
        <pose>unknown</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>0</xmin>
            <ymin>0</ymin>
            <xmax>0</xmax>
            <ymax>0</ymax>
        </bndbox>
    </object>
</annotation>
"""

Then I do this:

import xml.etree.ElementTree as et

tree = et.fromstring(data)

Where I'm stuck is the next part. The author edits their file with these lines of code:

for data in tree.findall("data"):
    name = data.attrib["name"]
    value = data.find("value")
    value.text = "[%s] %s" % (name, value.text)

I try to apply it to my own data like this:

for data in tree.findall("data"):  
    filename = data.find("filename")
    filename.text = "001.jpg"

But this doesn't seem to change anything when I print it out.

print(et.tostring(tree))

What am I doing wrong or what steps do I need to take to edit the name of the image from 'abc.jpg' to '001.jpg'?

I'm also trying to figure out how to change the values of the four items xmin, ymin, xmax, and ymax.

Upvotes: 0

Views: 263

Answers (2)

Tarje Bargheer

Reputation: 175

My preference is to use xmltodict. But from the link you posted, it seems you want to call .find("filename") from within the <annotation> tag and not a <data> tag (which isn't present in your XML data, as is also stated in a comment).

That is, your code could be changed "minimally" (I don't know ElementTree well enough to say what the best solution is) to something like:

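# Note: findall() only searches the children of `tree`. If `tree` is already
# the <annotation> root element, call tree.find("filename") on it directly.
for annotation in tree.findall("annotation"):
    filename = annotation.find("filename")
    filename.text = "001.jpg"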
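For reference, here is a minimal sketch of the same edit done with ElementTree alone, assuming the XML is parsed with fromstring() as in the question. In that case the returned element is the <annotation> root itself, so the lookup starts from it rather than from a findall() loop. The `new_box` values are made up for illustration and stand in for one row of the CSV.

```python
import xml.etree.ElementTree as et

data = """<annotation>
    <folder>VOC2007</folder>
    <filename>abc.jpg</filename>
    <object>
        <name>blah</name>
        <bndbox>
            <xmin>0</xmin>
            <ymin>0</ymin>
            <xmax>0</xmax>
            <ymax>0</ymax>
        </bndbox>
    </object>
</annotation>
"""

# fromstring() returns the root element, which here is <annotation> itself.
root = et.fromstring(data)

# <filename> is a direct child of the root, so find() on the root reaches it.
root.find("filename").text = "001.jpg"

# The box coordinates are nested, so use a path relative to the root.
# new_box is a hypothetical set of values standing in for one CSV row.
new_box = {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}
bndbox = root.find("object/bndbox")
for tag, value in new_box.items():
    bndbox.find(tag).text = str(value)  # element text must be a string

result = et.tostring(root, encoding="unicode")
print(result)
```

Passing encoding="unicode" makes tostring() return a str instead of bytes, which is easier to read when printed.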

Upvotes: 0

Laurent LAPORTE

Reputation: 22952

I assume you have read your CSV file and extracted a collection of dictionary-like records, for instance:

record = {
    'folder': "VOC2007",
    'filename': "abc.jpg",
    'name': "blah",
    'pose': "unknown",
    'truncated': "0",
    'difficult': "0",
    'xmin': "0",
    'ymin': "0",
    'xmax': "0",
    'ymax': "0",
}

A simple thing you can do is to use a string template to generate your XML content (since it is very simple):

import textwrap

template = textwrap.dedent("""\
<annotation>
    <folder>{folder}</folder>
    <filename>{filename}</filename>
    <object>
        <name>{name}</name>
        <pose>{pose}</pose>
        <truncated>{truncated}</truncated>
        <difficult>{difficult}</difficult>
        <bndbox>
            <xmin>{xmin}</xmin>
            <ymin>{ymin}</ymin>
            <xmax>{xmax}</xmax>
            <ymax>{ymax}</ymax>
        </bndbox>
    </object>
</annotation>""")

To generate your XML content you can do:

from xml.sax.saxutils import escape

escaped = {k: escape(v) for k, v in record.items()}
data = template.format(**escaped)

The function xml.sax.saxutils.escape is used to convert "<", ">" and "&" into XML entities.

The result is:

<annotation>
    <folder>VOC2007</folder>
    <filename>abc.jpg</filename>
    <object>
        <name>blah</name>
        <pose>unknown</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>0</xmin>
            <ymin>0</ymin>
            <xmax>0</xmax>
            <ymax>0</ymax>
        </bndbox>
    </object>
</annotation>
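To get one .xml file per row, the same template approach can be put in a loop. The following is only a sketch under some assumptions: the CSV text, the shortened template, and the rule of naming each file after its image are all made up for illustration, and the files go to a temporary directory. With a pandas DataFrame, `df.to_dict("records")` would give similar per-row dictionaries, though non-string values would need str() before escaping.

```python
import csv
import io
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape

# Shortened template for illustration; the full template above works the same way.
template = """\
<annotation>
    <filename>{filename}</filename>
    <object>
        <name>{name}</name>
    </object>
</annotation>
"""

# Hypothetical CSV content standing in for the real file.
csv_text = "filename,name\n001.jpg,cat & dog\n002.jpg,bird\n"

out_dir = Path(tempfile.mkdtemp())
written = []
for row in csv.DictReader(io.StringIO(csv_text)):
    # Escape every value so characters like "&" stay valid inside the XML.
    escaped = {key: escape(value) for key, value in row.items()}
    # Name each XML file after its image, e.g. 001.jpg -> 001.xml.
    out_path = out_dir / (Path(row["filename"]).stem + ".xml")
    out_path.write_text(template.format(**escaped), encoding="utf-8")
    written.append(out_path)

print([p.name for p in written])
```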

Upvotes: 1
