Reputation: 25591
Handling CDATA with lxml means creating a parser with the right option (strip_cdata=False), but what about XSLT? For example:
from lxml import etree
parser = etree.XMLParser(strip_cdata=False)
tree = etree.parse('sample_with_cdata.xml', parser)
transform = etree.XSLT(etree.parse('dupe.xsl'))
xml_out = transform(tree)
xml_out.write('processed.xml')
When I run an XML file containing CDATA through lxml's XSLT processor, all CDATA sections are stripped. How can I tell the XSLT processor to leave CDATA as is?
PS. FYI, adding the same parser to etree.XSLT doesn't change the outcome.
Upvotes: 0
Views: 839
Reputation: 163262
As far as XSLT is concerned, CDATA sections in XML are just noise. XSLT treats <![CDATA["]]> the same as &quot;, which it treats the same as &#34;; they are different ways for the document author to write the same thing.
If you are using CDATA sections in your input to convey information, that is, if <![CDATA[xxx]]> means something different from xxx, then you need to change your XML design.
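To see the equivalence for yourself, here is a minimal sketch. It uses the standard-library parser rather than lxml just to keep it dependency-free, and the element name q is made up for the illustration. Each variant spells the same single character differently, and the parsed text comes out identical.

```python
import xml.etree.ElementTree as ET

# Four ways to write one double-quote character as element content.
# The element name "q" is hypothetical, chosen only for this sketch.
variants = [
    '<q><![CDATA["]]></q>',  # CDATA section
    '<q>&quot;</q>',         # predefined entity
    '<q>&#34;</q>',          # numeric character reference
    '<q>"</q>',              # literal character
]

# After parsing, the spelling used in the source is no longer visible.
texts = [ET.fromstring(v).text for v in variants]
all_equal = all(t == '"' for t in texts)
print(texts, all_equal)
```

This is the data model XSLT works on, which is why the processor has no record of which spelling the input used.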
Upvotes: 1
Reputation: 25591
This turns out not to be related to lxml; it was a gap in my own knowledge.
CDATA in XSLT output is controlled by the cdata-section-elements attribute of the xsl:output declaration. For example, if the description element in the XML file contains CDATA:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" cdata-section-elements='description' />
...
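For completeness, a rough end-to-end sketch with lxml. The rest of my stylesheet is not shown above, so the identity template here is only an assumption for illustration, as are the inline sample document and the item element name. It just shows that text inside description is written back out as a CDATA section.

```python
from lxml import etree

# Hypothetical sample input; only the "description" element name
# comes from the stylesheet snippet above.
parser = etree.XMLParser(strip_cdata=False)
xml_doc = etree.fromstring(
    "<item><description><![CDATA[<b>bold</b> & more]]></description></item>",
    parser,
)

# Assumed stylesheet: an identity transform plus the xsl:output
# declaration that names "description" in cdata-section-elements.
xslt_doc = etree.fromstring("""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
              cdata-section-elements="description"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(xslt_doc)
result = transform(xml_doc)

# str() serializes the result according to xsl:output.
serialized = str(result)
print(serialized)
```

If several elements need this treatment, cdata-section-elements takes a whitespace-separated list of element names.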
Upvotes: 1