maurobio

Reputation: 1577

Getting several elements from XML document using Python lxml

From the XML document below:

   <ns:getCommonNamesFromTSNResponse xmlns:ns="http://itis_service.itis.usgs.gov">
    <ns:return xmlns:ax21="http://data.itis_service.itis.usgs.gov/xsd" xmlns:ax23="http://metadata.itis_service.itis.usgs.gov/xsd" xmlns:ax26="http://itis_service.itis.usgs.gov/xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ax21:SvcCommonNameList">
    <ax21:tsn>183833</ax21:tsn>
    <ax21:commonNames xsi:type="ax21:SvcCommonName">
    <ax21:commonName>African hunting dog</ax21:commonName>
    <ax21:language>English</ax21:language>
    <ax21:tsn>183833</ax21:tsn></ax21:commonNames>
    <ax21:commonNames xsi:type="ax21:SvcCommonName">
    <ax21:commonName>African Wild Dog</ax21:commonName>
    <ax21:language>English</ax21:language>
    <ax21:tsn>183833</ax21:tsn></ax21:commonNames>
    <ax21:commonNames xsi:type="ax21:SvcCommonName">
    <ax21:commonName>Painted Hunting Dog</ax21:commonName>
    <ax21:language>English</ax21:language>
    <ax21:tsn>183833</ax21:tsn>
    </ax21:commonNames>
    </ns:return>
    </ns:getCommonNamesFromTSNResponse>

I want to get all the values of the "commonName" and "language" elements using the Python lxml library.

I tried this code:

import lxml.etree as ET
tree = ET.parse("names.xml")
namespaces = {'ax21': 'http://data.itis_service.itis.usgs.gov/xsd'} 
common_names = tree.findall(".//ax21:commonNames:ax21:commonName", namespaces)
langs = tree.findall(".//ax21:commonNames:ax21:language", namespaces)

but both calls return empty lists.

Any hints?

Upvotes: 0

Views: 39

Answers (1)

Ajay

Reputation: 5347

Case 1: getting each tag separately with find (which returns only the first match)

lan = tree.find('.//ax21:language', namespaces)
cn = tree.find('.//ax21:commonName', namespaces)
print(lan.text)
print(cn.text)

Output:

English
African hunting dog

If you need all of them, use findall:

common_names = tree.findall(".//ax21:commonName", namespaces)
[i.text for i in common_names]

Output:

['African hunting dog', 'African Wild Dog', 'Painted Hunting Dog']

If you need both of them together, you can use xpath with a union (|) expression:

a = tree.xpath('.//ax21:language | .//ax21:commonName', namespaces={'ax21': 'http://data.itis_service.itis.usgs.gov/xsd'})
[i.text for i in a]

Output:
['African hunting dog',
 'English',
 'African Wild Dog',
 'English',
 'Painted Hunting Dog',
 'English']

In the last case, passing the namespaces variable positionally, as with find and findall, is not sufficient. xpath expects the mapping as a keyword argument, in the form namespaces={'ax21': 'http://data.itis_service.itis.usgs.gov/xsd'}.
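
As for why the original attempt returned empty lists: most likely it is because the steps of the path were joined with ":" instead of "/". Below is a minimal sketch, not a definitive implementation, that uses the corrected path and also pairs each name with its language by iterating over the parent commonNames elements. The XML string is a trimmed copy of the document from the question, and the names XML, root, names and pairs are only illustrative. The fallback import is an assumption for environments without lxml; the standard library's ElementTree accepts the same findall/findtext calls.

```python
try:
    import lxml.etree as ET  # the library used in the question
except ImportError:
    import xml.etree.ElementTree as ET  # stdlib fallback with the same findall/findtext API

# Trimmed copy of the response from the question (two entries only).
XML = """<ns:getCommonNamesFromTSNResponse xmlns:ns="http://itis_service.itis.usgs.gov">
<ns:return xmlns:ax21="http://data.itis_service.itis.usgs.gov/xsd">
<ax21:tsn>183833</ax21:tsn>
<ax21:commonNames>
<ax21:commonName>African hunting dog</ax21:commonName>
<ax21:language>English</ax21:language>
</ax21:commonNames>
<ax21:commonNames>
<ax21:commonName>African Wild Dog</ax21:commonName>
<ax21:language>English</ax21:language>
</ax21:commonNames>
</ns:return>
</ns:getCommonNamesFromTSNResponse>"""

namespaces = {"ax21": "http://data.itis_service.itis.usgs.gov/xsd"}
root = ET.fromstring(XML)

# Path steps are separated by "/", not ":".
common_names = root.findall(".//ax21:commonNames/ax21:commonName", namespaces)
names = [el.text for el in common_names]

# Pair each name with its language by walking the parent elements.
pairs = [
    (
        entry.findtext("ax21:commonName", namespaces=namespaces),
        entry.findtext("ax21:language", namespaces=namespaces),
    )
    for entry in root.findall(".//ax21:commonNames", namespaces)
]

print(names)
print(pairs)
```

Iterating over the parent elements keeps each name tied to its own language, which the flat union result does not guarantee if an entry is missing one of the two children.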

Upvotes: 1
