tehDorf

Reputation: 795

Referencing an XML doctype entity from XSLT

I'm trying to read the !ENTITY declarations in an XML file's !DOCTYPE declaration from an XSL stylesheet that transforms the XML to HTML. In the XML, a widget element has an attribute whose value is the name of an !ENTITY, and I want the XSLT to replace that name with the !ENTITY's value.

XML File

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE root[
  <!ENTITY Widget_Manual "Widget_Manual_File_Name.pdf" >
]>

<root>
  <!-- I want to convert this to "Widget_Manual_File_Name.pdf" in the transform -->
  <widget entityIdent="Widget_Manual" />
</root>

XSLT File

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="/">
    <html>
      <head />
      <body>
        <xsl:apply-templates />
      </body>
    </html>
  </xsl:template>

  <xsl:template match="widget">
    <!-- Embed PDF -->
    <object width="800" height="600" type="application/pdf">
      <xsl:attribute name="data">
        <!-- How do I access the !ENTITY's value using the @entityIdent attribute? -->
        <xsl:value-of select="@entityIdent" />
      </xsl:attribute>
    </object>
  </xsl:template>

</xsl:stylesheet>

Actual Output

<object width="800" height="600" type="application/pdf" data="Widget_Manual"></object>

Desired Output

<object width="800" height="600" type="application/pdf" data="Widget_Manual_File_Name.pdf"></object>

Upvotes: 1

Views: 1072

Answers (1)

Mads Hansen

Reputation: 66723

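It would be best to use an entity reference in the XML itself, so that the XML parser expands the entity for you when it parses the document. The XSLT from the question then works unchanged, because @entityIdent already holds the file name.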

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE root[
  <!ENTITY Widget_Manual "Widget_Manual_File_Name.pdf" >
]>
<root>
    <!-- The parser expands the entity reference to "Widget_Manual_File_Name.pdf" before the transform runs -->
    <widget entityIdent="&Widget_Manual;" />
</root>
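To check the expansion without running a transform, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It is only an illustration of parser behaviour, not part of the XSLT solution; the helper name `widget_ident` and the constants are made up for this example.

```python
import xml.etree.ElementTree as ET

# The document from above: the attribute *references* the entity.
WITH_REFERENCE = """<?xml version="1.0"?>
<!DOCTYPE root[
  <!ENTITY Widget_Manual "Widget_Manual_File_Name.pdf" >
]>
<root>
  <widget entityIdent="&Widget_Manual;" />
</root>"""

# The document from the question: the attribute only *names* the entity.
WITH_NAME_ONLY = WITH_REFERENCE.replace("&Widget_Manual;", "Widget_Manual")


def widget_ident(xml_text: str) -> str:
    """Return the widget's entityIdent attribute as the parser reports it."""
    root = ET.fromstring(xml_text)
    return root.find("widget").get("entityIdent")


print(widget_ident(WITH_REFERENCE))  # Widget_Manual_File_Name.pdf
print(widget_ident(WITH_NAME_ONLY))  # Widget_Manual
```

The second call shows why the original XML produces `data="Widget_Manual"`: without the `&...;` syntax the parser sees an ordinary string and has nothing to expand.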

If you don't have control of the XML and want to perform the find/replace in the XSLT, then you will have to jump through some hoops. XSLT operates on the tree that the parser has already built, and the entity declarations in the DTD are not part of that tree, so they are not directly addressable with XPath.

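However, you could obtain the base-uri() of the source document, use unparsed-text() to read the XML file as a string, and then use replace() with a regex that has a capture group (or substring functions) to extract the value of the ENTITY. Note that base-uri(), unparsed-text() and replace() require an XSLT 2.0 processor; they are not available in the XSLT 1.0 stylesheet from the question.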

The following XSLT 2.0 stylesheet could be used:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
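    <xsl:output indent="yes" />

    <!-- Identity template: copy everything else through unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Replace the entity name held in @entityIdent with the value declared
         in the DOCTYPE. The source file is re-read as plain text and the regex
         captures the quoted string that follows "!ENTITY name". -->
    <xsl:template match="@entityIdent">
        <xsl:attribute name="{name()}">
            <xsl:value-of select="replace(
                                   unparsed-text(base-uri()),
                                   concat('.*!ENTITY ', ., ' &quot;(.+?)&quot;.*'),
                                   '$1',
                                   's')"/>
        </xsl:attribute>
    </xsl:template>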
</xsl:stylesheet>
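If you want to sanity-check the regular expression outside an XSLT processor, the sketch below applies the same pattern with Python's `re` module. It assumes Python's `re.DOTALL` behaves like the `'s'` flag here and that `re.sub` stands in for XPath's replace(); the function name `entity_value` is hypothetical.

```python
import re

# The original XML from the question, read as plain text
# (this is what unparsed-text(base-uri()) would hand to replace()).
XML_TEXT = '''<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE root[
  <!ENTITY Widget_Manual "Widget_Manual_File_Name.pdf" >
]>
<root>
  <widget entityIdent="Widget_Manual" />
</root>'''


def entity_value(xml_text: str, name: str) -> str:
    """Mimic the stylesheet's replace() call for one entity name."""
    # Same pattern as the stylesheet: everything before the declaration,
    # the quoted value as group 1, and everything after it.
    pattern = ".*!ENTITY " + name + ' "(.+?)".*'
    # The whole text is replaced by group 1, leaving only the entity value.
    return re.sub(pattern, r"\1", xml_text, count=1, flags=re.DOTALL)


print(entity_value(XML_TEXT, "Widget_Manual"))  # Widget_Manual_File_Name.pdf
```

Two limitations carry over to the stylesheet. If no matching declaration exists, nothing is replaced and the result is the entire file text rather than an empty string. The pattern also expects exactly one space after `!ENTITY` and after the name, and a value in double quotes, so declarations formatted differently will not match.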

Upvotes: 1
