user3041764

Reputation: 849

Parsing XML in Python using ElementTree

This question has been answered hundreds of times, but I still could not find a solution that works for me. I tried the official documentation and answers on Stack Overflow.

I have this XML structure:

<?xml version="1.0" encoding="windows-1252"?>
<OpenShipments xmlns="x-schema:OpenShipments.xdr">
    <OpenShipment ProcessStatus="Processed" ShipmentOption="">
        <ShipTo>
            <CompanyOrName><![CDATA[xxx]]></CompanyOrName>
            <Attention><![CDATA[xxx]]></Attention>
            <Address1><![CDATA[xxx]]></Address1>
            <PostalCode><![CDATA[xxx]]></PostalCode>
            <CityOrTown><![CDATA[xxx]]></CityOrTown>
            <Telephone><![CDATA[xxx]]></Telephone>
            <EmailAddress><![CDATA[xxx]]></EmailAddress>
            <CountryTerritory><![CDATA[xxx]]></CountryTerritory>
        </ShipTo>
        <ShipmentInformation>
            <ServiceType>ST</ServiceType>
            <PackageType>CP</PackageType>
            <ShipmentActualWeight><![CDATA[XXX]]></ShipmentActualWeight>
            <QVNOption>
                <QVNRecipientAndNotificationTypes>
                    <CompanyOrName/>
                    <ContactName/>
                    <EMailAddress/>
                    <LabelCreation/>
                </QVNRecipientAndNotificationTypes>
                <ShipFromCompanyOrName>xxx</ShipFromCompanyOrName>
            </QVNOption>
        </ShipmentInformation>
        <ProcessMessage>

            <ShipmentRates>
                <ShipmentCharges>
                    <Rate>
                        <Published>XXX</Published>
                        <Negotiated>XXX</Negotiated>
                    </Rate>
                </ShipmentCharges>
                <ShipperCharges>
                    <Rate>
                        <Published>XXX</Published>
                        <Negotiated>XXX</Negotiated>
                    </Rate>
                </ShipperCharges>
                <ReceiverCharges>
                    <Rate>
                        <Published>0,00</Published>
                        <Negotiated>0,00</Negotiated>
                    </Rate>
                </ReceiverCharges>
                <QVN>
                    <Rate>
                        <Published>0,00</Published>
                        <Negotiated>0,00</Negotiated>
                    </Rate>
                </QVN>
                <PackageRates>
                    <PackageRate>
                        <TrackingNumber>TRACKING NUMBER</TrackingNumber>
                        <PackageCharges>
                            <Rate>
                                <Published>0,00</Published>
                                <Negotiated>0,00</Negotiated>
                            </Rate>
                        </PackageCharges>
                        <Delivery_AreaSurcharge>
                            <Rate>
                                <Published>0,00</Published>
                                <Negotiated>0,00</Negotiated>
                            </Rate>
                        </Delivery_AreaSurcharge>
                    </PackageRate>
                </PackageRates>
            </ShipmentRates>
            <TrackingNumbers>
                <TrackingNumber>TRACKING NUMBER</TrackingNumber>
            </TrackingNumbers>
            <ShipID>XXX</ShipID>
            <ImportID></ImportID>
            <Reference1></Reference1>
            <Reference2></Reference2>
            <ShipmentID></ShipmentID>
            <PRONumber></PRONumber>
        </ProcessMessage>
    </OpenShipment>
</OpenShipments>

I need to get the "TrackingNumber" value. I tried the findall() and find() functions, but with no result.

import xml.etree.ElementTree as ET

tree = ET.parse('file.out')
root = tree.getroot()

print(root.findall('TrackingNumber'))
# []
print(root.find('TrackingNumber'))
# None

ElementTree is supposed to make accessing XML elements simple, but this has proved too difficult for me.

Upvotes: 2

Views: 128

Answers (1)

Padraic Cunningham

Reputation: 180502

You need a namespace mapping:

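from xml.etree import ElementTree as et

# x holds the XML document from the question as a string
xm = et.fromstring(x)
# map a prefix of your choice to the document's default namespace
ns = {"op": 'x-schema:OpenShipments.xdr'}
print(xm.findall('.//op:TrackingNumber', ns))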

which will give you something like:

[<Element '{x-schema:OpenShipments.xdr}TrackingNumber' at 0x7fa210579550>, <Element '{x-schema:OpenShipments.xdr}TrackingNumber' at 0x7fa210579910>]
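To get the actual values rather than Element objects, read each element's .text. Below is a minimal sketch, assuming a trimmed-down copy of the XML from the question; the SAMPLE string and the variable names are made up for illustration. It also shows the equivalent {uri}tag spelling, which needs no prefix map, and confirms that the un-namespaced lookup from the question finds nothing.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample mirroring the structure in the question
# (placeholder values, not real data).
SAMPLE = """<OpenShipments xmlns="x-schema:OpenShipments.xdr">
    <OpenShipment ProcessStatus="Processed" ShipmentOption="">
        <ProcessMessage>
            <ShipmentRates>
                <PackageRates>
                    <PackageRate>
                        <TrackingNumber>TRACKING NUMBER</TrackingNumber>
                    </PackageRate>
                </PackageRates>
            </ShipmentRates>
            <TrackingNumbers>
                <TrackingNumber>TRACKING NUMBER</TrackingNumber>
            </TrackingNumbers>
        </ProcessMessage>
    </OpenShipment>
</OpenShipments>"""

root = ET.fromstring(SAMPLE)
ns = {"op": "x-schema:OpenShipments.xdr"}

# The plain tag name matches nothing: every element lives in the
# default namespace, and a bare name only searches direct children.
unqualified = root.findall("TrackingNumber")

# Every TrackingNumber anywhere in the tree, via the prefix map.
all_numbers = [el.text for el in root.findall(".//op:TrackingNumber", ns)]

# Only the entries under <TrackingNumbers>, written in {uri}tag form.
uri = "{x-schema:OpenShipments.xdr}"
path = ".//%sTrackingNumbers/%sTrackingNumber" % (uri, uri)
listed = [el.text for el in root.findall(path)]

print(unqualified)
print(all_numbers)
print(listed)
```

The prefix "op" is arbitrary; only the URI it maps to has to match the xmlns value in the document.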

Upvotes: 2
