andrebruton

Reputation: 2386

Getting multiple child values from XML doc using Python

I'm reading XML METAR (Weather) Data using Python. I can read the data, and have also added error checking (only for visibility_statute_mi below!). Here is an example of the XML data:

<METAR>
  <raw_text>
  FALE 201800Z VRB01KT 9999 FEW016 BKN028 23/22 Q1010 NOSIG
  </raw_text>
  <station_id>FALE</station_id>
  <observation_time>2013-01-20T18:00:00Z</observation_time>
  <temp_c>23.0</temp_c>
  <dewpoint_c>22.0</dewpoint_c>
  <wind_dir_degrees>0</wind_dir_degrees>
  <wind_speed_kt>1</wind_speed_kt>
  <altim_in_hg>29.822834</altim_in_hg>
  <quality_control_flags>
  <no_signal>TRUE</no_signal>
  </quality_control_flags>
  <sky_condition sky_cover="FEW" cloud_base_ft_agl="1600"/>
  <sky_condition sky_cover="BKN" cloud_base_ft_agl="2800"/>
  <flight_category>MVFR</flight_category>
  <metar_type>METAR</metar_type>
</METAR>

Here is my Python 2.7 code to parse the data:

# Output the XML in an HTML-friendly manner
def outputHTML(xml):
    # Get the METAR data list
    metar_data = xml.getElementsByTagName("data")

    # Our return string
    outputString = ""

    # Cycle through the metar_data
    for state in metar_data:

        # Get the stations and cycle through them
        stations = state.getElementsByTagName("METAR")
        for station in stations:
            # Grab data from the station element
            raw_text                = station.getElementsByTagName("raw_text")[0].firstChild.data
            station_id              = station.getElementsByTagName("station_id")[0].firstChild.data
            observation_time        = station.getElementsByTagName('observation_time')[0].firstChild.data
            temp_c                  = station.getElementsByTagName('temp_c')[0].firstChild.data
            dewpoint_c              = station.getElementsByTagName('dewpoint_c')[0].firstChild.data
            wind_dir_degrees        = station.getElementsByTagName('wind_dir_degrees')[0].firstChild.data
            wind_speed_kt           = station.getElementsByTagName('wind_speed_kt')[0].firstChild.data
            visibility_nodes        = station.getElementsByTagName('visibility_statute_mi')
            # visibility_statute_mi is optional; fall back to an empty string when it is missing
            visibility_statute_mi   = visibility_nodes[0].firstChild.data if len(visibility_nodes) > 0 else ""
            altim_in_hg             = station.getElementsByTagName('altim_in_hg')[0].firstChild.data
            metar_type              = station.getElementsByTagName('metar_type')[0].firstChild.data

            # Append the data onto the string
            string = "<tr><td>" + str(station_id) + "</td><td>" + str(observation_time) + "</td><td>" + str(raw_text) + "</td><td>" + str(temp_c) + "</td><td>" + str(dewpoint_c) + "</td></tr>"
            outputString+=string

    # Output string
    return outputString    

How do I read the sky_condition data and loop to get the sky_cover and cloud_base_ft_agl values?

I also need to check whether any sky_condition elements are present, because quite often there is no cloud cover and therefore no data.

Andre

Upvotes: 0

Views: 433

Answers (1)

kr1

Reputation: 7485

I would parse the XML into a tree and query it, for example like this:

import xml.etree.ElementTree as et

xmltext = """
<METAR>
  <raw_text>
  FALE 201800Z VRB01KT 9999 FEW016 BKN028 23/22 Q1010 NOSIG
  </raw_text>
  <station_id>FALE</station_id>
  <observation_time>2013-01-20T18:00:00Z</observation_time>
  <temp_c>23.0</temp_c>
  <dewpoint_c>22.0</dewpoint_c>
  <wind_dir_degrees>0</wind_dir_degrees>
  <wind_speed_kt>1</wind_speed_kt>
  <altim_in_hg>29.822834</altim_in_hg>
  <quality_control_flags>
  <no_signal>TRUE</no_signal>
  </quality_control_flags>
  <sky_condition sky_cover="FEW" cloud_base_ft_agl="1600"/>
  <sky_condition sky_cover="BKN" cloud_base_ft_agl="2800"/>
  <flight_category>MVFR</flight_category>
  <metar_type>METAR</metar_type>
</METAR>
"""
tree = et.fromstring(xmltext)

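# Each sky_condition element carries its values as XML attributes
for sky_con in tree.iterfind('sky_condition'):
    # Parenthesised print works on Python 2.7 and Python 3 alike
    print(sky_con.attrib["cloud_base_ft_agl"])
    print(sky_con.attrib.keys())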

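By reading keys() you can check whether the attribute you're interested in is present. If a report has no sky_condition elements at all, the loop body simply never runs, so no separate existence check is needed.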
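
As a minimal sketch of that check, the loop can be wrapped in a small helper that returns one tuple per cloud layer. The name sky_conditions and the second sample element (which omits cloud_base_ft_agl purely for illustration) are assumptions, not part of the original data. Element.get returns None for a missing attribute, and an empty list means the report had no sky_condition elements.

```python
import xml.etree.ElementTree as et


def sky_conditions(metar):
    """Return a list of (sky_cover, cloud_base_ft_agl) tuples for one METAR element.

    Hypothetical helper: missing attributes come back as None,
    and a METAR without sky_condition children yields an empty list.
    """
    layers = []
    for sky_con in metar.findall("sky_condition"):
        cover = sky_con.get("sky_cover")
        base = sky_con.get("cloud_base_ft_agl")
        layers.append((cover, base))
    return layers


# Illustrative input only: the second element has no cloud_base_ft_agl
sample = et.fromstring(
    '<METAR>'
    '<sky_condition sky_cover="FEW" cloud_base_ft_agl="1600"/>'
    '<sky_condition sky_cover="CLR"/>'
    '</METAR>'
)

print(sky_conditions(sample))
print(sky_conditions(et.fromstring("<METAR/>")))
```

Returning a list keeps the "no cloud data" case and the "several layers" case on the same code path: the caller only has to test whether the list is empty.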

Edit: if you would rather keep using xml.dom.minidom, you can add these lines inside your stations loop to extract the same attributes:

for sky_con in station.getElementsByTagName("sky_condition"):
    # getAttribute is the public minidom API; it returns "" when the attribute is absent
    print(sky_con.getAttribute("cloud_base_ft_agl"))
    print(sky_con.getAttribute("sky_cover"))
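
To feed these values into a table cell like the ones built in the question's outputHTML, something along the following lines might work. This is only a sketch: sky_conditions_dom and the "cover base" text format are hypothetical, and the sample document assumes the METAR elements sit inside a data element as in the question's code.

```python
from xml.dom import minidom


def sky_conditions_dom(station):
    """Collect (sky_cover, cloud_base_ft_agl) pairs from one minidom METAR element.

    Hypothetical helper; cloud_base_ft_agl is None when the attribute is absent.
    """
    layers = []
    for sky_con in station.getElementsByTagName("sky_condition"):
        cover = sky_con.getAttribute("sky_cover")
        if sky_con.hasAttribute("cloud_base_ft_agl"):
            base = sky_con.getAttribute("cloud_base_ft_agl")
        else:
            base = None
        layers.append((cover, base))
    return layers


# Cut-down sample; the <data> wrapper mirrors the question's code
doc = minidom.parseString(
    '<data><METAR>'
    '<station_id>FALE</station_id>'
    '<sky_condition sky_cover="FEW" cloud_base_ft_agl="1600"/>'
    '<sky_condition sky_cover="BKN" cloud_base_ft_agl="2800"/>'
    '</METAR></data>'
)

for station in doc.getElementsByTagName("METAR"):
    layers = sky_conditions_dom(station)
    # An empty list means the report carried no sky_condition elements
    sky_text = ", ".join("%s %s" % layer for layer in layers) if layers else ""
    print("<td>" + sky_text + "</td>")
```

Using hasAttribute and getAttribute avoids reaching into the private _attrs mapping, whose layout is an implementation detail of minidom.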

Upvotes: 3
