user3848394
user3848394

Reputation: 67

Python search and replace text (value of a tag) in a XML file without knowing the tag

I'm new at Python and I'm trying to use a XML file. I know how to parse and search information knowing the structure but I don't knows how to search a value without knowing the tag this value is attached to.

for example :

<bookstore>
  <book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <author>TRUE</author>
  <year>2005</year>
  <price>30.00</price>
</book>
  <book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
<book category="WEB">
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>TRUE</year>
  <price>39.95</price>
  </book>
<adventure>
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>TRUE</year>
  <price>TRUE</price>
</adventure>
</bookstore>

In this example, I would like to find all "TRUE" values an replace this value to "OK". How would you do that ?

Thank you

Upvotes: 0

Views: 2011

Answers (3)

user3848394
user3848394

Reputation: 67

Here what I have done and allow me to find all values in my xml file.

for node in root.iter():
        if (node.text != None):
            node.text = search_in_dictonary_foot(">"+node.text+"<")

Upvotes: 0

alecxe
alecxe

Reputation: 473873

Here's an option using xml.etree.ElementTree from standard library:

import xml.etree.ElementTree as ET

data = """xml here"""

tree = ET.fromstring(data)     
for element in tree.getiterator():
    if element.text == 'TRUE': 
        element.text = 'OK'    

print ET.tostring(tree)   

Prints:

<bookstore>
  <book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <author>OK</author>
  <year>2005</year>
  <price>30.00</price>
</book>
  <book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
<book category="WEB">
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>OK</year>
  <price>39.95</price>
  </book>
<adventure>
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>OK</year>
  <price>OK</price>
</adventure>
</bookstore>

Upvotes: 1

C.B.
C.B.

Reputation: 8326

If the word TRUE only exists between tags, you should be able to use a simple string replace

my_xml = """
<bookstore>
  <book category="COOKING">
  <title lang="en">Everyday Italian</title>
  <author>TRUE</author>
  <year>2005</year>
  <price>30.00</price>
</book>
  <book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>
<book category="WEB">
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>TRUE</year>
  <price>39.95</price>
  </book>
</bookstore>
"""
>>> my_xml.replace(">TRUE<",">OK<")
'\n<bookstore>\n  <book category="COOKING">\n  <title lang="en">Everyday Italian</title>\n  <author>OK</author>\n  <year>2005</year>\n  <price>30.00</price>\n</book>\n  <book category="CHILDREN">\n  <title lang="en">Harry Potter</title>\n  <author>J K. Rowling</author>\n  <year>2005</year>\n  <price>29.99</price>\n</book>\n<book category="WEB">\n  <title lang="en">Learning XML</title>\n  <author>Erik T. Ray</author>\n  <year>OK</year>\n  <price>39.95</price>\n  </book>\n</bookstore>\n'
>>> 

Definitely not as robust as using the xml lib, but should get the job done.

Upvotes: 0

Related Questions