Reputation: 65
I want to replace the string 'bicycle' with 'bike' in my XML file, but only inside the name tags; the image filename must stay unchanged. I tried re.sub together with .readlines(), but that did not work. What is the most efficient way to do this? A good explanation would help a lot.
<annotation>
<folder>images</folder>
<filename>bicycle (10).jpg</filename>
<path>C:\Users\Merida\Desktop\Bicycle\images\bicycle (10).jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>960</width>
<height>636</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>bicycle</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>68</xmin>
<ymin>24</ymin>
<xmax>755</xmax>
<ymax>632</ymax>
</bndbox>
</object>
<object>
<name>bicycle</name>
<pose>Unspecified</pose>
<truncated>1</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>1</xmin>
<ymin>28</ymin>
<xmax>189</xmax>
<ymax>435</ymax>
</bndbox>
</object>
</annotation>
Upvotes: 2
Views: 2507
Reputation: 22177
Please try the following XSLT-based solution.
The XSLT follows the so-called Identity Transform pattern.
It changes the value of each <name> element from 'bicycle' to 'bike' and leaves everything else intact.
Input XML
<?xml version="1.0"?>
<annotation>
<folder>images</folder>
<filename>bicycle (10).jpg</filename>
<path>C:\Users\Merida\Desktop\Bicycle\images\bicycle (10).jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>960</width>
<height>636</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>bicycle</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>68</xmin>
<ymin>24</ymin>
<xmax>755</xmax>
<ymax>632</ymax>
</bndbox>
</object>
<object>
<name>bicycle</name>
<pose>Unspecified</pose>
<truncated>1</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>1</xmin>
<ymin>28</ymin>
<xmax>189</xmax>
<ymax>435</ymax>
</bndbox>
</object>
</annotation>
XSLT
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="name[.='bicycle']">
<xsl:copy>bike</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Output XML
<annotation>
<folder>images</folder>
<filename>bicycle (10).jpg</filename>
<path>C:\Users\Merida\Desktop\Bicycle\images\bicycle (10).jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>960</width>
<height>636</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>bike</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>68</xmin>
<ymin>24</ymin>
<xmax>755</xmax>
<ymax>632</ymax>
</bndbox>
</object>
<object>
<name>bike</name>
<pose>Unspecified</pose>
<truncated>1</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>1</xmin>
<ymin>28</ymin>
<xmax>189</xmax>
<ymax>435</ymax>
</bndbox>
</object>
</annotation>
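For readers without an XSLT processor at hand, the identity-transform idea can be imitated in plain Python. This is only a rough sketch using the standard library's xml.etree.ElementTree, not a replacement for the stylesheet above: the `transform` function and the shortened `SAMPLE` document are made up for illustration. Every node is copied unchanged (the identity step), and a single override handles <name> elements whose text is exactly 'bicycle'.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of the annotation file from the question.
SAMPLE = """<annotation>
  <filename>bicycle (10).jpg</filename>
  <object><name>bicycle</name></object>
  <object><name>bicycle</name></object>
</annotation>"""


def transform(node):
    """Copy `node` recursively, overriding only <name>bicycle</name>."""
    # Identity step: same tag, same attributes, same text and tail.
    clone = ET.Element(node.tag, dict(node.attrib))
    clone.text = node.text
    clone.tail = node.tail
    # Override step: the one rule that differs from a plain copy,
    # mirroring the match="name[.='bicycle']" template.
    if node.tag == "name" and node.text == "bicycle":
        clone.text = "bike"
    # Recurse into children, like <xsl:apply-templates/>.
    for child in node:
        clone.append(transform(child))
    return clone


result = transform(ET.fromstring(SAMPLE))
print(ET.tostring(result, encoding="unicode"))
```

The <filename> element goes through the identity step only, so its text is left alone.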
Upvotes: 1
Reputation: 12701
If you want to replace ALL occurrences of "bicycle", that is easy to do with str.replace:
input_file = "example.xml"
output_file = "output.xml"
with open(input_file) as f:
    xml_content = f.readlines()
with open(output_file, 'w+') as f:
    for line in xml_content:
        f.write(line.replace('bicycle', 'bike'))
However, if you want to be sure the structure of your XML stays intact (for example, if an element or attribute name happened to be bicycle), take a look at ElementTree or lxml.
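To make the difference concrete, here is a small sketch that compares the plain replace with a re.sub call anchored on the surrounding tags, since the question mentioned re.sub. The `SAMPLE` string and `NAME_PATTERN` are made up for illustration, and the pattern assumes the tags are written exactly as <name>...</name> with no attributes. Regular expressions on XML are fragile in general, so treat this as a quick fix rather than a robust solution.

```python
import re

# Shortened, hypothetical version of the annotation file from the question.
SAMPLE = """<annotation>
<filename>bicycle (10).jpg</filename>
<object>
<name>bicycle</name>
</object>
</annotation>"""

# Plain replace: every occurrence changes, including the file name.
replaced_everywhere = SAMPLE.replace("bicycle", "bike")

# Regex anchored on the tags: only <name>bicycle</name> changes.
# The two groups keep the opening and closing tags as they were.
NAME_PATTERN = re.compile(r"(<name>\s*)bicycle(\s*</name>)")
replaced_in_names = NAME_PATTERN.sub(r"\g<1>bike\g<2>", SAMPLE)

print(replaced_everywhere)
print(replaced_in_names)
```

Note that the substitution runs on the whole file content at once, not line by line, so it would still work if a tag and its text were split across lines.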
Edit: following the edit to your question, here is a cleaner solution with ElementTree:
import xml.etree.ElementTree as ET
input_file = "example.xml"
output_file = "output.xml"
tree = ET.parse(input_file)
root = tree.getroot()
name_elts = root.findall(".//name")  # find all 'name' elements
for elt in name_elts:
    if elt.text:  # skip empty <name/> elements, whose text is None
        elt.text = elt.text.replace("bicycle", "bike")
tree.write(output_file)
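A possible variant, sketched here under my own assumptions: if only labels that are exactly "bicycle" should change (as opposed to any label that merely contains the word), compare the text instead of calling replace. The `rename_label` helper and the "bicycle rack" label in `SAMPLE` are hypothetical and not taken from the question's data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the "bicycle rack" label is invented for the demo.
SAMPLE = """<annotation>
<filename>bicycle (10).jpg</filename>
<object><name>bicycle</name></object>
<object><name>bicycle rack</name></object>
</annotation>"""


def rename_label(xml_text, old, new):
    """Return (updated XML string, number of <name> elements changed)."""
    root = ET.fromstring(xml_text)
    changed = 0
    for elt in root.iter("name"):
        # Exact comparison: labels that only contain `old` are left alone.
        if elt.text is not None and elt.text.strip() == old:
            elt.text = new
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


updated, count = rename_label(SAMPLE, "bicycle", "bike")
print(count)
print(updated)
```

Returning the number of changed elements makes it easy to notice files in which nothing matched.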
Upvotes: 2
Reputation: 43
This was my approach. It replaces all instances of "bicycle" with "bike", including those in the filename and path elements, which I think is what you were looking for. Replace "test.xml" with the name of your own file.
# Open file containing xml text and copy contents to string
f = open("test.xml", "r+")
xmlText = f.read()
# Bring pointer back to start of file and delete all contents
f.seek(0)
f.truncate()
# Replace all instances of bicycle with bike
newText = xmlText.replace("bicycle", "bike")
# Write this new text with replaced words to the file and close
f.write(newText)
f.close()
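The same read, replace, write steps can also be sketched with pathlib, which avoids the manual seek/truncate/close calls. The `replace_in_file` helper is a made-up name, and the UTF-8 encoding is an assumption (the code above relies on the platform default). The demo runs against a throwaway file in a temporary directory rather than a real annotation file.

```python
import tempfile
from pathlib import Path


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite `path` in place; return how many occurrences were replaced."""
    path = Path(path)
    text = path.read_text(encoding=encoding)
    # write_text truncates the file before writing the new content.
    path.write_text(text.replace(old, new), encoding=encoding)
    return text.count(old)


# Demo on a temporary file so that no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "test.xml"
    demo.write_text("<name>bicycle</name>", encoding="utf-8")
    replaced = replace_in_file(demo, "bicycle", "bike")
    demo_result = demo.read_text(encoding="utf-8")

print(replaced, demo_result)
```

Like the code above, this changes every occurrence in the file, not only the ones inside name tags.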
Upvotes: 2