Reputation: 13
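I'm trying to extract the full text of the title element from the XML below, so that I end up with
title_text = 'word1 Word2 word3 word4'
The problem is that with the code below I only get title_text = 'word1', i.e. everything after the first child element is missing.
How can I get the complete text?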
XML:
<response>...<results>...<grouping>...<group>...
<doc>...
<title>
word1
<hlword>Word2</hlword>
<hlword>word3</hlword>
word4
</title>
...
</doc>
</group>...</grouping>...</results>...</response>...
Code for parse:
from lxml import objectify
...
tree = objectify.fromstring(xml)
nodes = tree.response.results.grouping.group
for node in nodes:
    title_element = node.doc.title
    # .text holds only the text that precedes the first child element
    title_text = title_element.text
    print(title_text)
Upvotes: 0
Reputation: 174622
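Just iterate over .itertext(), which yields every text fragment inside the element, including the text of its child elements, in document order:
>>> for node in nodes:
...     print(' '.join(node.doc.title.itertext()))
...
word1 Word2 word3 word4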
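If the real document is indented the way the question shows it, the fragments returned by itertext() will probably carry newlines and leading spaces. A minimal, self-contained sketch of one way to normalise that is below. It makes a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml.objectify (both provide itertext()), the sample XML is a cut-down stand-in for the elided document in the question, and the helper name title_text is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, reduced from the XML in the question.
XML = """
<response><results><grouping><group>
  <doc>
    <title>
      word1
      <hlword>Word2</hlword>
      <hlword>word3</hlword>
      word4
    </title>
  </doc>
</group></grouping></results></response>
"""


def title_text(title):
    """Return all text inside `title`, with whitespace collapsed."""
    # itertext() yields the element's own text, the text of each child,
    # and the text that follows each child, in document order.
    raw = ' '.join(title.itertext())
    # split() with no arguments drops newlines and indentation.
    return ' '.join(raw.split())


root = ET.fromstring(XML)
first_title = root.find('.//title')

# .text alone stops at the first child element.
text_only = first_title.text.strip()

# itertext() covers the whole element.
titles = [title_text(t) for t in root.iter('title')]
print(titles)
```

With lxml.objectify the same helper should work on node.doc.title unchanged, since that element also exposes itertext().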
Upvotes: 1