Reputation: 1135
I'm trying to take data from a .csv file and create individual .xml files for each row. I've read the .csv into Pandas already. Where I'm struggling is trying to figure out how to make edits in .xml files.
I'm using this previous answer as a guide to try to learn this:
Applying the author's solution to my data would look something like this:
data = """<annotation>
<folder>VOC2007</folder>
<filename>abc.jpg</filename>
<object>
<name>blah</name>
<pose>unknown</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>0</xmin>
<ymin>0</ymin>
<xmax>0</xmax>
<ymax>0</ymax>
</bndbox>
</object>
</annotation>
"""
Then I do this:
import xml.etree.ElementTree as et
tree = et.fromstring(data)
Where I'm stuck is the next part. The author edits their file with this line of code:
for data in tree.findall("data"):
    name = data.attrib["name"]
    value = data.find("value")
    value.text = "[%s] %s" % (name, value.text)
I try to apply it to my own like this:
for data in tree.findall("data"):
    filename = data.find("filename")
    filename.text = "001.jpg"
But this doesn't seem to change anything when I print it out.
print(et.tostring(tree))
What am I doing wrong or what steps do I need to take to edit the name of the image from 'abc.jpg' to '001.jpg'?
I'm also trying to figure out how to change the values of the four items xmin, ymin, xmax, and ymax.
Upvotes: 0
Views: 263
Reputation: 175
My preference is xmltodict. But from the link you posted, it seems you want to call .find("filename") from within the <annotation> tag and not a <data> tag (which isn't present in your XML data, as is also stated in a comment).
That is, your code could be changed "minimally" (I don't know ElementTree well enough to say what the best solution is) to something like:
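# tree is already the <annotation> root element; iter() includes the root
# itself, whereas findall() only searches the root's children.
for annotation in tree.iter("annotation"):
    filename = annotation.find("filename")
    filename.text = "001.jpg"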
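To make the point concrete, here is a minimal sketch using only the standard library. It assumes the XML from the question (shortened slightly) and uses hypothetical bounding-box values in place of a real CSV row. Since et.fromstring() returns the root element, which is <annotation> itself, the child tags can be looked up directly on it:

```python
import xml.etree.ElementTree as et

data = """<annotation>
<folder>VOC2007</folder>
<filename>abc.jpg</filename>
<object>
<name>blah</name>
<bndbox>
<xmin>0</xmin>
<ymin>0</ymin>
<xmax>0</xmax>
<ymax>0</ymax>
</bndbox>
</object>
</annotation>"""

# fromstring() returns the root element, which here is <annotation> itself.
root = et.fromstring(data)

# <filename> is a direct child of the root, so find() on the root reaches it.
root.find("filename").text = "001.jpg"

# Hypothetical box values, standing in for one row of the CSV.
box = {"xmin": "10", "ymin": "20", "xmax": "110", "ymax": "220"}

# A path with "/" walks down through nested children.
bndbox = root.find("object/bndbox")
for tag, value in box.items():
    bndbox.find(tag).text = value

result = et.tostring(root, encoding="unicode")
print(result)
```

The original loop over tree.findall("data") never ran its body because there is no <data> child under the root, which is why nothing appeared to change.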
Upvotes: 0
Reputation: 22952
I assume you read your CSV file and extract a collection of dictionary-like records, for instance:
record = {
    'folder': "VOC2007",
    'filename': "abc.jpg",
    'name': "blah",
    'pose': "unknown",
    'truncated': "0",
    'difficult': "0",
    'xmin': "0",
    'ymin': "0",
    'xmax': "0",
    'ymax': "0",
}
A simple thing you can do is to use a string template to generate your XML content (since it is very simple):
import textwrap
template = textwrap.dedent("""\
<annotation>
<folder>{folder}</folder>
<filename>{filename}</filename>
<object>
<name>{name}</name>
<pose>{pose}</pose>
<truncated>{truncated}</truncated>
<difficult>{difficult}</difficult>
<bndbox>
<xmin>{xmin}</xmin>
<ymin>{ymin}</ymin>
<xmax>{xmax}</xmax>
<ymax>{ymax}</ymax>
</bndbox>
</object>
</annotation>""")
To generate your XML content you can do:
from xml.sax.saxutils import escape
escaped = {k: escape(v) for k, v in record.items()}
data = template.format(**escaped)
The function xml.sax.saxutils.escape is used to convert "<", ">" and "&" into XML entities.
The result is:
<annotation>
<folder>VOC2007</folder>
<filename>abc.jpg</filename>
<object>
<name>blah</name>
<pose>unknown</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>0</xmin>
<ymin>0</ymin>
<xmax>0</xmax>
<ymax>0</ymax>
</bndbox>
</object>
</annotation>
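To get one .xml file per row, the same template can be applied in a loop. The following is only a sketch under a few assumptions: the CSV text, its column names, the temporary output directory, and the naming rule (image name with the extension swapped for .xml) are all made up for illustration, and csv.DictReader stands in for however you actually load your rows.

```python
import csv
import io
import tempfile
import textwrap
from pathlib import Path
from xml.sax.saxutils import escape

template = textwrap.dedent("""\
    <annotation>
    <folder>{folder}</folder>
    <filename>{filename}</filename>
    <object>
    <name>{name}</name>
    <pose>{pose}</pose>
    <truncated>{truncated}</truncated>
    <difficult>{difficult}</difficult>
    <bndbox>
    <xmin>{xmin}</xmin>
    <ymin>{ymin}</ymin>
    <xmax>{xmax}</xmax>
    <ymax>{ymax}</ymax>
    </bndbox>
    </object>
    </annotation>""")

# Hypothetical CSV content; the header names must match the template fields.
csv_text = textwrap.dedent("""\
    folder,filename,name,pose,truncated,difficult,xmin,ymin,xmax,ymax
    VOC2007,001.jpg,cat & dog,unknown,0,0,10,20,110,220
    VOC2007,002.jpg,blah,unknown,0,0,5,5,50,50""")

# Write into a fresh temporary directory (an assumption for this sketch).
out_dir = Path(tempfile.mkdtemp())
written = []
for record in csv.DictReader(io.StringIO(csv_text)):
    # Escape every value so characters like "&" stay valid inside XML.
    escaped = {k: escape(v) for k, v in record.items()}
    # Name each file after its image: 001.jpg -> 001.xml
    xml_path = out_dir / (Path(record["filename"]).stem + ".xml")
    xml_path.write_text(template.format(**escaped), encoding="utf-8")
    written.append(xml_path)
```

If the rows live in a Pandas DataFrame instead, df.to_dict("records") yields a similar list of dictionaries. The values may then not be strings, so they would probably need to go through str() before being passed to escape().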
Upvotes: 1