Reputation: 39
<XMLReport><Report>
<Preflight errors="0" criticalfailures="0" noncriticalfailures="0" signoffs="0" fixes="0" warnings="10">
<PreflightResult type="Check" level="warning">
<PreflightResultEntry xml:lang="en-US">
<Message>PDF/X-1a:20000 : PDF/X-1a:20000 output intent is missing </Message>
<StringContext>
<BaseString>PDF/X-1a:20000 : %PDFXVersion% output intent is missing</BaseString>
</StringContext>
</PreflightResultEntry>
</PreflightResult>
</Preflight></Report>
I want to extract the text of every <Message> element in this report using lxml in Python.
Thanks
Upvotes: 0
Views: 314
Reputation: 14209
This is straightforward, following the lxml tutorial:
>>> from lxml import etree
>>> s = """<Report>
<Preflight errors="0" criticalfailures="0" noncriticalfailures="0" signoffs="0" fixes="0" warnings="10">
<PreflightResult type="Check" level="warning">
<PreflightResultEntry xml:lang="en-US">
<Message>PDF/X-1a:20000 : PDF/X-1a:20000 output intent is missing </Message>
<StringContext>
<BaseString>PDF/X-1a:20000 : %PDFXVersion% output intent is missing</BaseString>
</StringContext>
</PreflightResultEntry>
</PreflightResult>
</Preflight></Report>
"""
>>> root = etree.XML(s)
>>> for message in root.findall('Preflight/PreflightResult/PreflightResultEntry/Message'):
...     print(message.text)
...
PDF/X-1a:20000 : PDF/X-1a:20000 output intent is missing
>>>
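The path above names every level explicitly, so it only matches Message elements at exactly that depth. Since the report declares warnings="10", there may be many entries; a minimal sketch that collects the text of every Message wherever it sits in the tree could look like the following. The helper name message_texts and the trimmed sample report are illustrative assumptions, not part of the original question, and the fallback import to the standard library is only there so the sketch runs when lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library, which offers the
    # same fromstring() and iter() calls used below.
    import xml.etree.ElementTree as etree

# Trimmed sample modelled on the report in the question.
XML_REPORT = """<Report>
<Preflight errors="0" warnings="10">
<PreflightResult type="Check" level="warning">
<PreflightResultEntry xml:lang="en-US">
<Message>PDF/X-1a:20000 : PDF/X-1a:20000 output intent is missing </Message>
</PreflightResultEntry>
</PreflightResult>
</Preflight></Report>"""


def message_texts(xml_string):
    """Return the stripped text of every <Message> element, at any depth."""
    root = etree.fromstring(xml_string)
    # iter("Message") walks the whole tree, so no explicit path is needed.
    return [(m.text or "").strip() for m in root.iter("Message")]


for text in message_texts(XML_REPORT):
    print(text)
```

Using iter() trades precision for convenience: it also picks up Message elements nested somewhere unexpected, so keep the explicit findall() path if only that one location should match.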
Upvotes: 2