Reputation: 189
This is my XML file
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdp>141100</gdp>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
</data>
How can I get the tag names of all the child nodes of country?
For example, I need the output as ['rank','year','gdp','neighbor']
Upvotes: 13
Views: 59505
Reputation: 3415
In Python 3 you can do this with xml.etree.ElementTree:
import xml.etree.ElementTree as ET
root = ET.parse('file.xml').getroot()
for node in root.findall('.//country/*'):
    print(node.tag)
# Alternative 1: loop over the list of country nodes
for country in root.findall('.//country'):
    for node in country:
        print(node.tag)
# Alternative 2: loop over the first child of root
for node in root[0]:
    print(node.tag)
# For all nodes use iter(), which also includes the starting node
for node in root.iter():
    print(node.tag)
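To try the variants above without a file on disk, here is a minimal sketch that parses the XML from the question out of a string and collects the tags into lists instead of printing them one by one. The names XML_TEXT, child_tags and all_tags are only illustrative.

```python
import xml.etree.ElementTree as ET

# The XML from the question, inlined so the example runs without a file.
XML_TEXT = """<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdp>141100</gdp>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
</data>"""

root = ET.fromstring(XML_TEXT)

# Direct children of every <country> element, in document order.
child_tags = [node.tag for node in root.findall('.//country/*')]
print(child_tags)

# iter() walks the whole tree and includes the starting element itself.
all_tags = [node.tag for node in root.iter()]
print(all_tags)
```

Note that iter() also yields data and country, so it answers a slightly different question than the findall() variants.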
Upvotes: 0
Reputation: 231
This code was tested under Python 3.6:
import xml.etree.ElementTree as ET
name = '4.xml'
tree = ET.parse(name)
root = tree.getroot()
ditresult = []
for child in root:
    for child1 in child:
        ditresult.append(child1.tag)
print(ditresult)
=============
['rank', 'year', 'gdp', 'neighbor', 'neighbor']
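The list above contains 'neighbor' twice, while the question asks for each tag once. One possible way to drop the duplicates and still keep document order is dict.fromkeys(). This is a sketch, not part of the original answer; it relies on dicts preserving insertion order, which is guaranteed from Python 3.7 (and is an implementation detail of CPython 3.6). The XML is inlined here so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Inlined copy of the question's XML (assumption: same structure as the file).
XML_TEXT = """<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdp>141100</gdp>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
</data>"""

root = ET.fromstring(XML_TEXT)

# Same traversal as above: grandchildren of the root, i.e. children of each country.
tags = [child1.tag for child in root for child1 in child]

# dict.fromkeys keeps the first occurrence of each key in insertion order.
unique_tags = list(dict.fromkeys(tags))
print(unique_tags)
```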
Upvotes: 1
Reputation: 6564
Have a look at the Python documentation. It uses this very XML tree as its example.
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
r = list(root[0])  # same as root[0].getchildren(), which was removed in Python 3.9
list(map(lambda e: e.tag, r))
# ['rank', 'year', 'gdp', 'neighbor', 'neighbor']
By the way, when you are stuck, open a REPL and go step by step. I do not remember all of the stuff above, and I last used an XML parser 2 or 3 years ago. But I know that "try and see" is the best teacher.
These are the steps I took to come up with the solution.
# imports and other stuff.
>>> tree = ET.parse('data.xml')
>>> root = tree.getroot()
>>> country = root[0]
>>> dir(country)
['__class__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__len__', '__module__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_children', 'append', 'attrib', 'clear', 'copy', 'extend', 'find', 'findall', 'findtext', 'get', 'getchildren', 'getiterator', 'insert', 'items', 'iter', 'iterfind', 'itertext', 'keys', 'makeelement', 'remove', 'set', 'tag', 'tail', 'text']
>>> country.keys()
['name']
>>> country.getchildren()
[<Element 'rank' at 0x7f873cf53910>, <Element 'year' at 0x7f873cf539d0>, <Element 'gdp' at 0x7f873cf53a90>, <Element 'neighbor' at 0x7f873cf53c10>, <Element 'neighbor' at 0x7f873cf53c50>]
>>> country.getchildren()[0]
<Element 'rank' at 0x7f873cf53910>
>>> r = country.getchildren()[0]
>>> dir(r)
['__class__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__len__', '__module__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_children', 'append', 'attrib', 'clear', 'copy', 'extend', 'find', 'findall', 'findtext', 'get', 'getchildren', 'getiterator', 'insert', 'items', 'iter', 'iterfind', 'itertext', 'keys', 'makeelement', 'remove', 'set', 'tag', 'tail', 'text']
>>> r.tag
'rank'
>>> r = country.getchildren()[0]
>>> r
<Element 'rank' at 0x7f873cf53910>
>>> r = country.getchildren()
>>> r
[<Element 'rank' at 0x7f873cf53910>, <Element 'year' at 0x7f873cf539d0>, <Element 'gdp' at 0x7f873cf53a90>, <Element 'neighbor' at 0x7f873cf53c10>, <Element 'neighbor' at 0x7f873cf53c50>]
>>> map(lambda e: e.tag, r)
['rank', 'year', 'gdp', 'neighbor', 'neighbor']
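The session above appears to come from Python 2, where map() returns a list. Element.getchildren() was deprecated and then removed in Python 3.9, so on a current interpreter a rough equivalent is to iterate the element directly. The following is a sketch under that assumption, with the question's XML inlined so it runs without a file.

```python
import xml.etree.ElementTree as ET

# Inlined copy of the question's XML (assumption: same structure as data.xml).
XML_TEXT = """<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdp>141100</gdp>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
</data>"""

root = ET.fromstring(XML_TEXT)
country = root[0]

# Attribute names of the <country> element.
attr_names = list(country.keys())
print(attr_names)

# list(element) takes the place of element.getchildren().
children = list(country)

# In Python 3 map() is lazy, so wrap it in list() to see the tags.
tags = list(map(lambda e: e.tag, children))
print(tags)
```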
Upvotes: 4
Reputation: 1672
Use ElementTree lib to pull out the child nodes. This might help you.
import xml.etree.ElementTree as ET
tree = ET.parse("file.xml")
root = tree.getroot()
for child in root:
    print({x.tag for x in root.findall(child.tag + "/*")})
Upvotes: 18
Reputation: 92854
The solution using xml.etree.ElementTree module:
import xml.etree.ElementTree as ET
tree = ET.parse("yourxml.xml")
root = tree.getroot()
tag_names = {t.tag for t in root.findall('.//country/*')}
print(tag_names) # print a set of unique tag names
The output:
{'gdp', 'rank', 'neighbor', 'year'}
'.//country/*' is an XPath expression that selects all direct child elements of every country node.
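To illustrate what the expression does and does not match, here is a small sketch on a hypothetical document: a second country and a nested note element were added to the question's XML purely for illustration. It compares './/country/*' (direct children only) with './/country//*' (all descendants), both of which are part of the XPath subset that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the second <country> and the nested <note>
# are not in the question; they only illustrate the matching rules.
XML_TEXT = """<data>
<country name="Liechtenstein">
<rank>1</rank>
<neighbor name="Austria" direction="E"><note>nested</note></neighbor>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
</country>
</data>"""

root = ET.fromstring(XML_TEXT)

# Direct children of any <country>, merged across both countries.
direct = {t.tag for t in root.findall('.//country/*')}

# All descendants of any <country>, which also picks up <note>.
descendants = {t.tag for t in root.findall('.//country//*')}

# Sets are unordered, so sort before printing for a stable display.
print(sorted(direct))
print(sorted(descendants))
```

Because the result is a set, the tags from all country elements are merged and their document order is lost.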
Upvotes: 5