William

Reputation: 5

Parsing XML Processing Instruction

I'm trying to parse the xmlts20130923/xmlconf/xmltest/valid/sa/017a.xml file from the XML W3C Conformance Test Suite 20130923:

<!DOCTYPE doc [
<!ELEMENT doc (#PCDATA)>
]>
<doc><?pi some data ? > <??></doc>

Processing Instructions Definition

I think there is a single processing instruction whose data is "some data ? > <?", because the "? >" sequence does not close it: the whitespace between "?" and ">" means it is not the "?>" terminator. Is this assumption correct, or are there two processing instructions, the second of which would have no target and no data?

Upvotes: 0

Views: 40

Answers (1)

LMC

Reputation: 12822

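The OP's assumption is correct. A PI ends only at the first literal "?>", so "? >" is ordinary data. There is a single PI with target pi, and its content is some data ? > <?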

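from lxml import etree

# tmp.xml is assumed to hold a copy of 017a.xml
tree = etree.parse("tmp.xml")
# Select every processing instruction in the document
pis = tree.xpath("//processing-instruction()")
print(len(pis))       # 1
print(pis[0].target)  # pi
print(pis[0].text)    # some data ? > <?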
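To see why the parser reads it this way, here is a minimal hand-rolled sketch of the PI rule using only the standard library. It is not how lxml works internally. The function name read_pi and the SAMPLE string are made up for illustration, and it skips target-name validation and the reserved "xml" target check.

```python
XML_WS = " \t\r\n"  # the whitespace characters XML recognises

def read_pi(text, start=0):
    """Return (target, data, end_index) for the first PI at or after
    start, or None if no complete PI is found. Illustrative only."""
    open_at = text.find("<?", start)
    if open_at == -1:
        return None
    # A PI ends at the FIRST literal "?>"; "? >" does not count.
    close_at = text.find("?>", open_at + 2)
    if close_at == -1:
        return None
    body = text[open_at + 2:close_at]
    # The target runs up to the first whitespace character;
    # the data is whatever follows that whitespace.
    i = 0
    while i < len(body) and body[i] not in XML_WS:
        i += 1
    target = body[:i]
    data = body[i:].lstrip(XML_WS)
    return target, data, close_at + 2

SAMPLE = "<doc><?pi some data ? > <??></doc>"

first = read_pi(SAMPLE)
print(first[:2])                  # ('pi', 'some data ? > <?')
print(read_pi(SAMPLE, first[2]))  # None: no second PI follows
```

The "<?" inside the data is never treated as the start of a new PI, because the scan for the terminator begins right after the opening "<?" and stops at the first "?>" it meets.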

Upvotes: 0
