Reputation: 4421
My XML element looks like this:
<para>Lorem ipsum (R<inf>0</inf>) dolor sit amnet</para>
Trying to get the entire text with
import xml.etree.ElementTree as ET
xml = ET.fromstring('<para>Lorem ipsum (R<inf>0</inf>) dolor sit amnet</para>')
xml.text
results in 'Lorem ipsum (R'. Hence, everything from <inf> onward is dropped. How can I make the XML parser ignore/delete this element?
Upvotes: 0
Views: 160
Reputation: 4421
The solution is plain and simple: join the text fragments returned by .itertext():
import xml.etree.ElementTree as ET
xml = ET.fromstring('<para>Lorem ipsum (R<inf>0</inf>) dolor sit amnet</para>')
''.join(xml.itertext())
Credit goes to Jon Clements.
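For anyone wondering why .text stops early: ElementTree stores the text before the first child in the parent's .text, and the text that follows a child in that child's .tail. The sketch below shows both, and also a variant that drops the content of <inf> entirely, in case that is what "ignore/delete" was meant to cover. The helper name text_without_children is made up for this illustration and is not part of ElementTree.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring('<para>Lorem ipsum (R<inf>0</inf>) dolor sit amnet</para>')

# .text only holds the text before the first child element
print(repr(para.text))   # 'Lorem ipsum (R'

# the text after </inf> is stored on the child, as its .tail
inf = para.find('inf')
print(repr(inf.text))    # '0'
print(repr(inf.tail))    # ') dolor sit amnet'

# itertext() walks the element and its descendants in document order,
# so joining its output keeps the content of <inf>
full_text = ''.join(para.itertext())
print(full_text)         # Lorem ipsum (R0) dolor sit amnet


def text_without_children(elem):
    """Hypothetical helper: text of elem, skipping the content of direct children."""
    parts = [elem.text or '']
    for child in elem:
        # keep only what follows each child, not what is inside it
        parts.append(child.tail or '')
    return ''.join(parts)


without_inf = text_without_children(para)
print(without_inf)       # Lorem ipsum (R) dolor sit amnet
```

Note that the helper only looks at direct children; text sitting after deeper descendants is dropped along with the child that contains them.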
Upvotes: 2