Reputation: 67
I'm new at Python and I'm trying to use a XML file. I know how to parse and search information knowing the structure but I don't knows how to search a value without knowing the tag this value is attached to.
for example :
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>TRUE</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>TRUE</year>
<price>39.95</price>
</book>
<adventure>
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>TRUE</year>
<price>TRUE</price>
</adventure>
</bookstore>
In this example, I would like to find all "TRUE" values an replace this value to "OK". How would you do that ?
Thank you
Upvotes: 0
Views: 2011
Reputation: 67
Here what I have done and allow me to find all values in my xml file.
for node in root.iter():
if (node.text != None):
node.text = search_in_dictonary_foot(">"+node.text+"<")
Upvotes: 0
Reputation: 473873
Here's an option using xml.etree.ElementTree
from standard library:
import xml.etree.ElementTree as ET
data = """xml here"""
tree = ET.fromstring(data)
for element in tree.getiterator():
if element.text == 'TRUE':
element.text = 'OK'
print ET.tostring(tree)
Prints:
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>OK</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>OK</year>
<price>39.95</price>
</book>
<adventure>
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>OK</year>
<price>OK</price>
</adventure>
</bookstore>
Upvotes: 1
Reputation: 8326
If the word TRUE
only exists between tags, you should be able to use a simple string replace
my_xml = """
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>TRUE</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>TRUE</year>
<price>39.95</price>
</book>
</bookstore>
"""
>>> my_xml.replace(">TRUE<",">OK<")
'\n<bookstore>\n <book category="COOKING">\n <title lang="en">Everyday Italian</title>\n <author>OK</author>\n <year>2005</year>\n <price>30.00</price>\n</book>\n <book category="CHILDREN">\n <title lang="en">Harry Potter</title>\n <author>J K. Rowling</author>\n <year>2005</year>\n <price>29.99</price>\n</book>\n<book category="WEB">\n <title lang="en">Learning XML</title>\n <author>Erik T. Ray</author>\n <year>OK</year>\n <price>39.95</price>\n </book>\n</bookstore>\n'
>>>
Definitely not as robust as using the xml lib, but should get the job done.
Upvotes: 0