Reputation: 1173
I am trying to parse out certain tags from an XML document and it is returning this error: AttributeError: '_ElementStringResult' object has no attribute 'text'
Here is the xml document:
<?xml version='1.0' encoding='ASCII'?>
<Root>
<Data>
<FormType>Log</FormType>
<Submitted>2012-03-19 07:34:07</Submitted>
<ID>1234</ID>
<LAST>SJTK4</LAST>
<Latitude>36.7027777778</Latitude>
<Longitude>-108.046111111</Longitude>
<Speed>0.0</Speed>
</Data>
</Root>
Here is the code I am using
from lxml import etree
from StringIO import StringIO
import MySQLdb
import glob
import os
import shutil
import logging
import sys
localPath = r"C:\data"
xmlFiles = glob.glob1(localPath,"*.xml")
for file in xmlFiles:
    a = os.path.join(localPath,file)
    element = etree.parse(a)
    Data = element.xpath('//Root/Data/node()')
    parsedData = [{field.tag: field.text for field in Data} for action in Data]
    print parsedData #AttributeError: '_ElementStringResult' object has no attribute 'text'
Upvotes: 0
Views: 2043
Reputation: 295272
Instead of querying for //Root/Data/node(), query for /Root/Data/* if you want only elements (as opposed to text nodes) to be returned. (Also, using only a single leading / rather than // allows the engine to do a cheaper search, rather than needing to look through the whole subtree for an additional Root.)
Also, are you sure you really want to loop through the entire list of subelements of Data inside your inner loop, rather than looping over only the subelements of a single Data element selected by your outer loop? I think your logic is broken, though it would only be visible if you had a file with more than one Data element under Root.
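A minimal sketch of both points, assuming Python 3 with lxml installed. The inline sample document, its second Data element, and the parse_records helper name are made up for illustration and are not from the question:

```python
from lxml import etree

# Sample document with two <Data> elements; the second one is invented
# here only to show why the loops should work per element.
XML = b"""<?xml version='1.0' encoding='ASCII'?>
<Root>
  <Data>
    <FormType>Log</FormType>
    <ID>1234</ID>
  </Data>
  <Data>
    <FormType>Log</FormType>
    <ID>5678</ID>
  </Data>
</Root>"""


def parse_records(root):
    """Return one {tag: text} dict per <Data> element (hypothetical helper)."""
    records = []
    # Outer loop: select each Data element under the document root.
    for data in root.xpath('/Root/Data'):
        # Inner query: '*' matches child elements only, never text nodes.
        records.append({field.tag: field.text for field in data.xpath('*')})
    return records


root = etree.fromstring(XML)
records = parse_records(root)
print(records)
```

Each Data element ends up with its own dictionary, so fields from one record cannot leak into another.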
Upvotes: 2
Reputation: 9997
'//Root/Data/node()' will return a list of all the child nodes, including text nodes, which come back as strings and do not have a text attribute. If you put a print right after the Data = ... line you will see something like ['\n ', <Element FormType at 0x10675fdc0>, '\n ', ...
I would do a filter first such as:
Data = [f for f in element.xpath('//Root/Data/node()') if hasattr(f, 'text')]
Then I think the following line could be rewritten as:
parsedData = {field.tag: field.text for field in Data}
which gives a dictionary mapping each element's tag to its text, which I believe is what you want.
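To see the effect of the filter, here is a small sketch assuming Python 3 with lxml installed. The shortened inline document and the variable names are illustrative only. On Python 3 the text results are str subclasses rather than the '_ElementStringResult' shown in the question's traceback:

```python
from lxml import etree

# Shortened version of the question's document, kept inline for the demo.
XML = b"""<Root>
  <Data>
    <FormType>Log</FormType>
    <ID>1234</ID>
  </Data>
</Root>"""

root = etree.fromstring(XML)
nodes = root.xpath('//Root/Data/node()')

# Whitespace between tags comes back as string results, not elements.
text_nodes = [n for n in nodes if isinstance(n, str)]

# Same filter as above: keep only results that have a 'text' attribute.
elements = [n for n in nodes if hasattr(n, 'text')]

parsed = {field.tag: field.text for field in elements}
print(parsed)
```

Note that comments and processing instructions also have a text attribute, so this filter fits documents like the one in the question that contain only elements and whitespace.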
Upvotes: 2