Nick H

Reputation: 123

Updating XML elements and attribute values using Python etree

I'm trying to use Python 2.7's ElementTree library to parse an XML file, replace specific attribute values with test data, and save the result as a new, uniquely named XML file.

My idea for a solution was to (1) source the new data from a CSV file by reading the file into a string, (2) split the string at the delimiters, (3) append the pieces to a list, and then (4) use ElementTree to update/delete/replace each attribute with a specific value from the list.

I've looked at the ElementTree documentation and saw the clear() and remove() methods, but I can't work out the syntax for using them here.

An example of the XML to modify is below - attributes with XXXXX are to be replaced/updated:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>

The intended result will be, for example:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="12345">
        <Pty ID="ABCDE" R="1"/>
    </RptSide>
</TrdCaptRpt>

How do I use ElementTree to update the attributes in the base XML with items from the list?

Upvotes: 10

Views: 27738

Answers (1)

jcollado

Reputation: 40384

For this kind of work, I always recommend BeautifulSoup because it has a really easy-to-learn API:

from BeautifulSoup import BeautifulStoneSoup as Soup

xml = """
<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""

soup = Soup(xml)
# The parser lowercases tag and attribute names, so use lowercase here.
rpt_side = soup.trdcaptrpt.rptside
rpt_side['txt1'] = 'Updated'
rpt_side.pty['id'] = 'Updated'

print soup

Example output (note that tag and attribute names come out lowercased):

<trdcaptrpt rptid="10000001" transtyp="0">
<rptside side="1" txt1="Updated">
<pty id="Updated" r="1">
</pty></rptside>
</trdcaptrpt>

Edit: With xml.etree.ElementTree you could use the following script:

from xml.etree import ElementTree as etree

xml = """
<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""

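root = etree.fromstring(xml)
# find() returns the first matching direct child; names are case-sensitive.
rpt_side = root.find('RptSide')
# set() overwrites the attribute value in place.
rpt_side.set('Txt1', 'Updated')
pty = rpt_side.find('Pty')
pty.set('ID', 'Updated')
print etree.tostring(root)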

Example output:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="Updated">
        <Pty ID="Updated" R="1" />
    </RptSide>
</TrdCaptRpt>
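
To tie this back to the CSV part of the question, here is a rough sketch of how the same find()/set() calls could be driven from CSV rows, producing one tree per row. It is written for Python 3 rather than 2.7, and the CSV layout (two columns: the Txt1 value, then the Pty ID), the helper names fill_template, build_trees and write_trees, and the trade_NNN.xml file naming are all assumptions made for illustration, not anything from the question.

```python
import csv
import io
import os
from xml.etree import ElementTree as etree

TEMPLATE = """<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>"""

# Assumed CSV layout: one row per output file, columns are Txt1 value, Pty ID.
CSV_TEXT = "12345,ABCDE\n67890,FGHIJ\n"


def fill_template(template, txt1, pty_id):
    """Parse a fresh copy of the template and set the two attributes."""
    root = etree.fromstring(template)
    rpt_side = root.find("RptSide")
    rpt_side.set("Txt1", txt1)
    rpt_side.find("Pty").set("ID", pty_id)
    return etree.ElementTree(root)


def build_trees(csv_text, template=TEMPLATE):
    """Return one filled-in ElementTree per CSV row."""
    rows = csv.reader(io.StringIO(csv_text))
    return [fill_template(template, txt1, pty_id) for txt1, pty_id in rows]


def write_trees(trees, directory="."):
    """Write each tree to its own file (hypothetical naming scheme)."""
    paths = []
    for index, tree in enumerate(trees, start=1):
        path = os.path.join(directory, "trade_%03d.xml" % index)
        tree.write(path)
        paths.append(path)
    return paths


trees = build_trees(CSV_TEXT)
```

Calling write_trees(trees, "some_output_dir") would then save each result as a separate file. Re-parsing the template for every row keeps the rows independent, so a value set for one file cannot leak into the next.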

Upvotes: 18
