Reputation: 7198
I have the XML file data below:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<rootnode>
<TExportCarcass>
<BodyNum>6168</BodyNum>
<BodyWeight>331.40</BodyWeight>
<UnitID>1</UnitID>
<Plant>239</Plant>
<pieces>
<TExportCarcassPiece index="0">
<Bruising>0</Bruising>
<RFIDPlant></RFIDPlant>
</TExportCarcassPiece>
<TExportCarcassPiece index="1">
<Bruising>0</Bruising>
<RFIDPlant></RFIDPlant>
</TExportCarcassPiece>
</pieces>
</TExportCarcass>
<TExportCarcass>
<BodyNum>6169</BodyNum>
<BodyWeight>334.40</BodyWeight>
<UnitID>1</UnitID>
<Plant>278</Plant>
<pieces>
<TExportCarcassPiece index="0">
<Bruising>0</Bruising>
<RFIDPlant></RFIDPlant>
</TExportCarcassPiece>
<TExportCarcassPiece index="1">
<Bruising>0</Bruising>
<RFIDPlant></RFIDPlant>
</TExportCarcassPiece>
</pieces>
</TExportCarcass>
</rootnode>
I am using Python's lxml module to read data from the XML file like below:
from lxml import etree
doc = etree.parse('file.xml')
memoryElem = doc.find('BodyNum')
print(memoryElem)
But it only prints None instead of 6168. Please suggest what I am doing wrong here.
Upvotes: 3
Views: 8470
Reputation: 376
1 - Use / to specify the tree level of the element you want to extract
2 - Use .text to extract the text content of the element
from lxml import etree
doc = etree.parse('file.xml')
memoryElem = doc.find("*/BodyNum")  # BodyNum sits one level below the root's children
print(memoryElem.text)  # .text returns the element's text content: 6168
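To make the difference between the element's name and its text concrete, here is a minimal sketch. It assumes a trimmed-down copy of the XML from the question, parsed from an in-memory string instead of file.xml; the variable names are only for illustration.

```python
from lxml import etree

# Trimmed-down copy of the question's XML (an assumption for this demo).
XML = b"""<rootnode>
  <TExportCarcass><BodyNum>6168</BodyNum></TExportCarcass>
  <TExportCarcass><BodyNum>6169</BodyNum></TExportCarcass>
</rootnode>"""

root = etree.fromstring(XML)

first = root.find("*/BodyNum")   # any child of the root, then BodyNum
tag_name = first.tag             # element name: 'BodyNum'
value = first.text               # text content: '6168'
missing = root.find("BodyNum")   # no BodyNum directly under the root, so None

print(tag_name, value, missing)
```

Since find returns None when nothing matches, it is probably worth checking the result before reading .text.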
Upvotes: 2
Reputation: 2259
When you call find with a bare tag name, it only searches the direct children of the root element. You can instead use an XPath-style path (find supports the ElementPath subset of XPath) to search for the element anywhere in the doc:
from lxml import etree
doc = etree.parse('file.xml')
memoryElem = doc.find('.//BodyNum')
memoryElem.text
# 6168
[ b.text for b in doc.iterfind('.//BodyNum') ]
# ['6168', '6169']
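The same idea applies to elements nested further down, such as Bruising. A small sketch, assuming a shortened version of the question's XML held in a string (the SAMPLE name is made up for this example):

```python
from lxml import etree

# Shortened version of the question's XML (an assumption for this demo).
SAMPLE = b"""<rootnode>
  <TExportCarcass>
    <BodyNum>6168</BodyNum>
    <pieces>
      <TExportCarcassPiece index="0"><Bruising>0</Bruising></TExportCarcassPiece>
      <TExportCarcassPiece index="1"><Bruising>0</Bruising></TExportCarcassPiece>
    </pieces>
  </TExportCarcass>
</rootnode>"""

root = etree.fromstring(SAMPLE)

# A fixed-depth path misses Bruising because it is nested deeper than two levels.
fixed_depth = root.find("*/Bruising")

# './/' matches descendants at any depth.
any_depth = root.find(".//Bruising").text

# Attributes are read with .get(); iterfind yields every match lazily.
indexes = [piece.get("index") for piece in root.iterfind(".//TExportCarcassPiece")]

print(fixed_depth, any_depth, indexes)
# None 0 ['0', '1']
```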
Upvotes: 2
Reputation: 92854
Your document contains multiple BodyNum elements.
You need to put an explicit limit into the query if you need only the first element.
Use the following flexible approach based on an XPath query:
from lxml import etree
doc = etree.parse('file.xml')
memoryElem = doc.xpath('(//BodyNum)[1]/text()')
print(memoryElem) # ['6168']
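Note that xpath returns a list here, so you may want to unpack it. A hedged sketch, assuming a trimmed copy of the question's XML in a string; the string(...) form is standard XPath 1.0 and returns a plain string instead of a list:

```python
from lxml import etree

# Trimmed copy of the question's XML (an assumption for this demo).
XML = b"""<rootnode>
  <TExportCarcass><BodyNum>6168</BodyNum></TExportCarcass>
  <TExportCarcass><BodyNum>6169</BodyNum></TExportCarcass>
</rootnode>"""

root = etree.fromstring(XML)

# text() selections come back as a list, even when limited to one node.
matches = root.xpath("(//BodyNum)[1]/text()")
first_value = matches[0] if matches else None

# string(...) converts the first matching node to a single string.
as_string = root.xpath("string((//BodyNum)[1])")

# A query with no matches yields an empty list, not None.
no_match = root.xpath("(//Missing)[1]/text()")

print(first_value, as_string, no_match)
# 6168 6168 []
```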
Upvotes: 0
Reputation: 82755
You need to iterate over each TExportCarcass tag and then use find to access BodyNum.
Ex:
from lxml import etree
doc = etree.parse('file.xml')
for elem in doc.findall('TExportCarcass'):
    print(elem.find("BodyNum").text)
Output:
6168
6169
or
print([i.text for i in doc.findall('TExportCarcass/BodyNum')]) #-->['6168', '6169']
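If you need more than one field per carcass, the same loop can build one record per TExportCarcass. This is just a sketch, assuming a trimmed copy of the question's XML in a string; the rows name and the dict layout are my own choices, and findtext is used as a shortcut for find(...).text:

```python
from lxml import etree

# Trimmed copy of the question's XML (an assumption for this demo).
XML = b"""<rootnode>
  <TExportCarcass>
    <BodyNum>6168</BodyNum><BodyWeight>331.40</BodyWeight><Plant>239</Plant>
  </TExportCarcass>
  <TExportCarcass>
    <BodyNum>6169</BodyNum><BodyWeight>334.40</BodyWeight><Plant>278</Plant>
  </TExportCarcass>
</rootnode>"""

root = etree.fromstring(XML)

# One dict per carcass; findtext returns the text of the first matching child.
rows = [
    {
        "BodyNum": carcass.findtext("BodyNum"),
        "BodyWeight": float(carcass.findtext("BodyWeight")),
        "Plant": carcass.findtext("Plant"),
    }
    for carcass in root.findall("TExportCarcass")
]

for row in rows:
    print(row)
```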
Upvotes: 2
Reputation: 627
Just use Python's built-in xml.etree.ElementTree module:
https://docs.python.org/3/library/xml.etree.elementtree.html
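A minimal sketch of what that could look like, assuming a trimmed copy of the question's XML in a string (with a real file you would use ET.parse('file.xml').getroot() instead). The lookup rules are the same as in lxml: a bare tag name only matches direct children.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (an assumption for this demo).
XML = """<rootnode>
  <TExportCarcass><BodyNum>6168</BodyNum></TExportCarcass>
  <TExportCarcass><BodyNum>6169</BodyNum></TExportCarcass>
</rootnode>"""

root = ET.fromstring(XML)

direct = root.find("BodyNum")                        # None: not a direct child
first = root.find(".//BodyNum").text                 # first match at any depth
body_nums = [e.text for e in root.iter("BodyNum")]   # every BodyNum in the tree

print(direct, first, body_nums)
# None 6168 ['6168', '6169']
```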
Upvotes: 0