gansub

Reputation: 1174

Unable to parse XML response and obtain elements

This is the XML response I get from an HTTP request:

<?xml version="1.0" encoding="UTF-8"?>
<Dataset name="aggregations/g/ds083.2/2/TP"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://xml.opendap.org/ns/DAP2"
 xsi:schemaLocation="http://xml.opendap.org/ns/DAP2          
http://xml.opendap.org/dap/dap2.xsd" >

    <Attribute name="NC_GLOBAL" type="Container">
        <Attribute name="Originating_or_generating_Center" type="String">
            <value>US National Weather Service, National Centres for Environmental Prediction (NCEP)</value>
        </Attribute>
        <Attribute name="Originating_or_generating_Subcenter" type="String">
            <value>0</value>
        </Attribute>
        <Attribute name="GRIB_table_version" type="String">
            <value>2,1</value>
        </Attribute>
        <Attribute name="Type_of_generating_process" type="String">
            <value>Forecast</value>
        </Attribute>
        <Attribute name="Analysis_or_forecast_generating_process_identifier_defined_by_originating_centre" type="String">
            <value>Analysis from GDAS (Global Data Assimilation System)</value>
        </Attribute>
        <Attribute name="file_format" type="String">
            <value>GRIB-2</value>
        </Attribute>
        <Attribute name="Conventions" type="String">
            <value>CF-1.6</value>
        </Attribute>
        <Attribute name="history" type="String">
            <value>Read using CDM IOSP GribCollection v3</value>
        </Attribute>
        <Attribute name="featureType" type="String">
            <value>GRID</value>
        </Attribute>
        <Attribute name="_CoordSysBuilder" type="String">
            <value>ucar.nc2.dataset.conv.CF1Convention</value>
        </Attribute>
    </Attribute>

    <Array name="time1">
        <Attribute name="units" type="String">
            <value>Hour since 2007-12-06T12:00:00Z</value>
        </Attribute>
        <Attribute name="standard_name" type="String">
            <value>time</value>
        </Attribute>
        <Attribute name="long_name" type="String">
            <value>GRIB forecast or observation time</value>
        </Attribute>
        <Attribute name="calendar" type="String">
            <value>proleptic_gregorian</value>
        </Attribute>
        <Attribute name="_CoordinateAxisType" type="String">
            <value>Time</value>
        </Attribute>
        <Float64/>
        <dimension name="time1" size="10380"/>
    </Array>

</Dataset>

I am trying to parse this XML content using Python 3.5:

import requests
from xml.etree import ElementTree

response = requests.get("http://rda.ucar.edu/thredds/dodsC/aggregations/g/ds083.2/2/TP.ddx?time1")

tree = ElementTree.fromstring(response.content)

attr = tree.find("Attribute")
print(attr)

When I print this I get None. What am I doing wrong? I also want to access the "Array" tag, but that also returns None.

Upvotes: 1

Views: 784

Answers (2)

mhawke

Reputation: 87134

The XML document uses namespaces, so your code needs to account for them. There is an explanation and example code in the etree documentation.

Basically you can do this:

import requests
from xml.etree import ElementTree

response = requests.get('http://rda.ucar.edu/thredds/dodsC/aggregations/g/ds083.2/2/TP.ddx?time1')

# fromstring() returns the root element (<Dataset>)
tree = ElementTree.fromstring(response.content)

# Qualify the tag name with the namespace URI in braces
attr = tree.find("{http://xml.opendap.org/ns/DAP2}Attribute")
print(attr)
# <Element '{http://xml.opendap.org/ns/DAP2}Attribute' at 0x7f147a292458>

# Or map a prefix to the namespace and pass the mapping to find()
ns = {'dap2': 'http://xml.opendap.org/ns/DAP2'}
attr = tree.find("dap2:Attribute", ns)
print(attr)
# <Element '{http://xml.opendap.org/ns/DAP2}Attribute' at 0x7f147a292458>
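The same namespace mapping also reaches the "Array" tag the question asks about. Here is a minimal offline sketch, assuming a trimmed copy of the response embedded as a string; `SAMPLE` and the variable names are illustrative and not part of the original code:

```python
from xml.etree import ElementTree

# Trimmed, hypothetical copy of the DDX response so the sketch runs offline.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<Dataset name="aggregations/g/ds083.2/2/TP"
    xmlns="http://xml.opendap.org/ns/DAP2">
    <Array name="time1">
        <Attribute name="units" type="String">
            <value>Hour since 2007-12-06T12:00:00Z</value>
        </Attribute>
        <Float64/>
        <dimension name="time1" size="10380"/>
    </Array>
</Dataset>"""

ns = {'dap2': 'http://xml.opendap.org/ns/DAP2'}

# Parse from bytes, as response.content would be.
root = ElementTree.fromstring(SAMPLE.encode('utf-8'))

# Child elements inherit the default namespace, so each tag needs the prefix.
array = root.find('dap2:Array', ns)
array_name = array.get('name')  # attributes themselves are not namespaced

dimension = array.find('dap2:dimension', ns)
dim_size = int(dimension.get('size'))

# A path with an attribute predicate selects one specific Attribute element.
units = array.find("dap2:Attribute[@name='units']/dap2:value", ns).text

print(array_name, dim_size, units)
```

Note that only element names need the prefix; plain attributes such as `name` and `size` are read with `get()` as usual.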

Upvotes: 1

Tryph

Reputation: 6219

As stated in the documentation, because the Dataset root tag carries the attribute xmlns="http://xml.opendap.org/ns/DAP2", every tag name you look for has to be prefixed with {http://xml.opendap.org/ns/DAP2}.

# should find something
tree.find("{http://xml.opendap.org/ns/DAP2}Attribute")

Reading this section of the ElementTree documentation will also show you how to make this more readable with a dictionary of namespaces.
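As a rough illustration of the namespace-dictionary style, the sketch below collects the global attributes into a plain dict. It assumes a shortened copy of the response; `SAMPLE`, `NS`, and `global_attributes` are hypothetical names chosen for this example:

```python
from xml.etree import ElementTree

# Shortened, hypothetical copy of the response so the sketch runs offline.
SAMPLE = b"""<Dataset xmlns="http://xml.opendap.org/ns/DAP2">
    <Attribute name="NC_GLOBAL" type="Container">
        <Attribute name="file_format" type="String">
            <value>GRIB-2</value>
        </Attribute>
        <Attribute name="Conventions" type="String">
            <value>CF-1.6</value>
        </Attribute>
    </Attribute>
</Dataset>"""

# One mapping, reused by every lookup instead of repeating the full URI.
NS = {'dap2': 'http://xml.opendap.org/ns/DAP2'}


def global_attributes(xml_bytes):
    """Return {attribute name: value text} for the NC_GLOBAL container."""
    root = ElementTree.fromstring(xml_bytes)
    container = root.find("dap2:Attribute[@name='NC_GLOBAL']", NS)
    return {
        child.get('name'): child.findtext('dap2:value', namespaces=NS)
        for child in container.findall('dap2:Attribute', NS)
    }


attrs = global_attributes(SAMPLE)
print(attrs)
```

The prefix `dap2` is arbitrary; it only has to match between the dictionary key and the search paths.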

Upvotes: 2
