Reputation: 795
I'm trying to access the !ENTITY elements in an XML file's !DOCTYPE declaration when using an XSL file to transform the XML to HTML. In the XML, I have a widget element that has an attribute that corresponds to the !ENTITY name, and I want the XSLT to transform that into the !ENTITY's value.
XML File
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE root[
<!ENTITY Widget_Manual "Widget_Manual_File_Name.pdf" >
]>
<root>
  <!-- I want to convert this to "Widget_Manual_File_Name.pdf" in the transform -->
  <widget entityIdent="Widget_Manual" />
</root>
XSLT File
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <html>
      <head />
      <body>
        <xsl:apply-templates />
      </body>
    </html>
  </xsl:template>
  <xsl:template match="widget">
    <!-- Embed PDF -->
    <object width="800" height="600" type="application/pdf">
      <xsl:attribute name="data">
        <!-- How do I access the !ENTITY's value using the @entityIdent attribute? -->
        <xsl:value-of select="@entityIdent" />
      </xsl:attribute>
    </object>
  </xsl:template>
</xsl:stylesheet>
Actual Output
<object width="800" height="600" type="application/pdf" data="Widget_Manual"></object>
Desired Output
<object width="800" height="600" type="application/pdf" data="Widget_Manual_File_Name.pdf"></object>
Reputation: 66723
It would be best to use an entity reference in the XML itself, so that the XML parser expands the entity for you when it parses the document.
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE root[
<!ENTITY Widget_Manual "Widget_Manual_File_Name.pdf" >
]>
<root>
  <!-- The parser expands this reference to "Widget_Manual_File_Name.pdf" before the XSLT runs -->
  <widget entityIdent="&Widget_Manual;" />
</root>
If you don't have control of the XML and want to perform the find/replace in the XSLT, then you will have to jump through some hoops. The XSLT operates on the XML that has already been parsed, and the DTD content is not directly addressable in the XML infoset.
However, you could obtain the base-uri() and then use unparsed-text() to read the XML as a string, and then use replace() with a regex that has a capture group (or substring functions) to obtain the value of the ENTITY.
The following XSLT 2.0 stylesheet could be used:
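<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes" />
  <!-- Identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Replace the entity name with the value declared in the DOCTYPE -->
  <xsl:template match="@entityIdent">
    <xsl:attribute name="{name()}">
      <!-- &quot; is needed because the select attribute is delimited by double quotes -->
      <xsl:value-of select="replace(
          unparsed-text(base-uri()),
          concat('.*!ENTITY ', ., ' &quot;(.+?)&quot;.*'),
          '$1',
          's')"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>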
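The substring-function route mentioned above could look roughly like the following. This is only a sketch under a few assumptions: the declaration is written exactly as `!ENTITY name "value"` with single spaces and double quotes, the processor supports XSLT 2.0, and the variable names `source` and `tail` are just illustrative. If no matching declaration is found, the attribute simply comes out empty.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@entityIdent">
    <!-- Raw text of the source document, DOCTYPE included -->
    <xsl:variable name="source" select="unparsed-text(base-uri())"/>
    <!-- Everything after the opening quote of the matching declaration
         (assumes the form: !ENTITY name "value") -->
    <xsl:variable name="tail"
        select="substring-after($source, concat('!ENTITY ', ., ' &quot;'))"/>
    <xsl:attribute name="{name()}">
      <!-- Keep only the text up to the closing quote -->
      <xsl:value-of select="substring-before($tail, '&quot;')"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
```

Compared with the regex version, this avoids having to worry about regex metacharacters in the entity name, but it is less tolerant of variations in whitespace or quoting in the DOCTYPE.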