daydreamer
daydreamer

Reputation: 91999

Python Manipulate and save XML without third-party libraries

I have a xml in which I have to search for a tag and replace the value of tag with a new values. For example,

<tag-Name>oldName</tag-Name>  

and replace oldName to newName like

<tag-Name>newName</tag-Name>    

and save the xml. How can I do that without using thirdparty lib like BeautifulSoup

Thank you

Upvotes: 1

Views: 5392

Answers (3)

Steven
Steven

Reputation: 28666

The best option from the standard lib is (I think) the xml.etree package.

Assuming that your example tag occurs only once somewhere in the document:

import xml.etree.ElementTree as etree
# or for a faster C implementation
# import xml.etree.cElementTree as etree

tree = etree.parse('input.xml')
elem = tree.find('//tag-Name') # finds the first occurrence of element tag-Name
elem.text = 'newName'
tree.write('output.xml')

Or if there are multiple occurrences of tag-Name, and you want to change them all if they have "oldName" as content:

import xml.etree.cElementTree as etree

tree = etree.parse('input.xml')
for elem in tree.findall('//tag-Name'):
    if elem.text == 'oldName':
        elem.text = 'newName'
# some output options for example
tree.write('output.xml', encoding='utf-8', xml_declaration=True)

Upvotes: 6

Kirk Strauser
Kirk Strauser

Reputation: 30947

If you are certain, I mean, completely 100% positive that the string <tag-Name> will never appear inside that tag and the XML will always be formatted like that, you can always use good old string manipulation tricks like:

xmlstring = xmlstring.replace('<tag-Name>oldName</tag-Name>', '<tag-Name>newName</tag-Name>')

If the XML isn't always as conveniently formatted as <tag>value</tag>, you could write something like:

a = """<tag-Name>


oldName


    </tag-Name>"""

def replacetagvalue(xmlstring, tag, oldvalue, newvalue):
    start = 0
    while True:
        try:
            start = xmlstring.index('<%s>' % tag, start) + 2 + len(tag)
        except ValueError:
            break
        end = xmlstring.index('</%s>' % tag, start)
        value = xmlstring[start:end].strip()
        if value == oldvalue:
            xmlstring = xmlstring[:start] + newvalue + xmlstring[end:]
    return xmlstring

print replacetagvalue(a, 'tag-Name', 'oldName', 'newName')

Others have already mentioned xml.dom.minidom which is probably a better idea if you're not absolutely certain beyond any doubt that your XML will be so simple. If you can guarantee that, though, remember that XML is just a big chunk of text that you can manipulate as needed.

I use similar tricks in some production code where simple checks like if "somevalue" in htmlpage are much faster and more understandable than invoking BeautifulSoup, lxml, or other *ML libraries.

Upvotes: 0

rplnt
rplnt

Reputation: 2409

Python has 'builtin' libraries for working with xml. For this simple task I'd look into minidom. You can find the docs here:

http://docs.python.org/library/xml.dom.minidom.html

Upvotes: 0

Related Questions