James Healy

Reputation: 15168

How can I get an XSLT working when the source document has no DOCTYPE?

I have the following XML document:

<?xml version="1.0"  encoding="UTF-8"?>
<!DOCTYPE ONIXmessage SYSTEM "http://www.editeur.org/onix/2.1/short/onix-iternational.dtd">
<ONIXmessage release="2.1">
  <header>
    <m174>Some Publisher</m174>
    <m182>20090622</m182>
  </header>
  <product>
    <a001>160258186X</a001>
    <a002>03</a002>
    <productidentifier>
      <b221>15</b221>
      <b244>9781602581869</b244>
    </productidentifier>
    <b246>02</b246>
    <b012>BB</b012>
    <title>
      <b202>01</b202>
      <b203>The Acts of the Apostles</b203>
      <b030>The</b030>
      <b031>Acts of the Apostles</b031>
      <b029>Four Centuries of Baptist Interpretation</b029>
    </title>
  </product>
</ONIXmessage>

and the following XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <xsl:variable name="target"><xsl:choose>
        <xsl:when test="/ONIXMessage">short</xsl:when>
        <xsl:otherwise>reference</xsl:otherwise>
    </xsl:choose></xsl:variable>
    <xsl:output method="xml" doctype-system="http://www.editeur.org/onix/2.1/reference/onix-international.dtd"/>
    <xsl:template match="*">
        <xsl:variable name="target-name">
            <xsl:choose>
                <xsl:when test="$target='short' and @shortname"><xsl:value-of select="@shortname"/></xsl:when>
                <xsl:when test="$target='reference' and @refname"><xsl:value-of select="@refname"/></xsl:when>
                <xsl:otherwise><xsl:value-of select="name()"/></xsl:otherwise>
            </xsl:choose>
        </xsl:variable>
        <xsl:element name="{$target-name}">
            <xsl:copy-of select="@*[not(name()='refname' or name()='shortname')]"/>
            <xsl:apply-templates select="*|text()"/>
        </xsl:element>
     </xsl:template>
     <xsl:template match="text()">
        <xsl:copy/>
    </xsl:template>
</xsl:stylesheet>

When I apply the XSLT, the output is perfect.

If I remove the DOCTYPE from the source document, the XSLT copies the source to the output with no changes. How can I get the XSLT to work even if the DOCTYPE is missing?

I am testing with the following command:

xsltproc stylesheet.xsl input.xml > output.xml

Upvotes: 2

Views: 1506

Answers (3)

snowangel

Reputation: 3462

For this specific transformation, note that ONIX 2.1 is deprecated and that Editeur will not serve http://www.editeur.org/onix/2.1/reference/onix-international.dtd, so you will have to store the DTD locally. Editeur, the industry body, has published notes on this.
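
One way to use a locally stored copy is to point the system identifier at a file path instead of the Editeur URL. This is a minimal sketch, assuming the DTD has been saved as dtd/onix-international.dtd relative to the document (a hypothetical location, not one given in the question):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: a copy of the DTD has been saved at dtd/onix-international.dtd,
     relative to this file. The path is hypothetical. -->
<!DOCTYPE ONIXmessage SYSTEM "dtd/onix-international.dtd">
<ONIXmessage release="2.1">
  <header>
    <m174>Some Publisher</m174>
    <m182>20090622</m182>
  </header>
</ONIXmessage>
```

A relative system identifier is resolved against the location of the document itself, so the dtd directory has to travel with the XML file.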

Upvotes: 3

Dimitre Novatchev

Reputation: 243529

You can easily add the DOCTYPE to the XML document in a pre-processing step like this:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"
 doctype-system=
 "http://www.editeur.org/onix/2.1/reference/onix-international.dtd"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to an XML document without a DOCTYPE (in this case the provided XML document from which the DOCTYPE has been removed):

<ONIXmessage release="2.1">
    <header>
        <m174>Some Publisher</m174>
        <m182>20090622</m182>
    </header>
    <product>
        <a001>160258186X</a001>
        <a002>03</a002>
        <productidentifier>
            <b221>15</b221>
            <b244>9781602581869</b244>
        </productidentifier>
        <b246>02</b246>
        <b012>BB</b012>
        <title>
            <b202>01</b202>
            <b203>The Acts of the Apostles</b203>
            <b030>The</b030>
            <b031>Acts of the Apostles</b031>
            <b029>Four Centuries of Baptist Interpretation</b029>
        </title>
    </product>
</ONIXmessage>

the result is the same XML document, but with the DOCTYPE correctly added:

<!DOCTYPE ONIXmessage
  SYSTEM "http://www.editeur.org/onix/2.1/reference/onix-international.dtd">
<ONIXmessage release="2.1">
   <header>
      <m174>Some Publisher</m174>
      <m182>20090622</m182>
   </header>
   <product>
      <a001>160258186X</a001>
      <a002>03</a002>
      <productidentifier>
         <b221>15</b221>
         <b244>9781602581869</b244>
      </productidentifier>
      <b246>02</b246>
      <b012>BB</b012>
      <title>
         <b202>01</b202>
         <b203>The Acts of the Apostles</b203>
         <b030>The</b030>
         <b031>Acts of the Apostles</b031>
         <b029>Four Centuries of Baptist Interpretation</b029>
      </title>
   </product>
</ONIXmessage>

Now, you can successfully apply your transformation on the result of the preprocessing stage.

Upvotes: 1

Michael Kay

Reputation: 163458

Since there is no @refname or @shortname in your input, copying the input to the output unchanged is exactly what this transformation appears to be trying to do. If it is intended to do something else, you will need to explain what that is. You haven't shown us the DTD, but there are various ways it could affect the outcome; for example, perhaps it declares default values for the @refname or @shortname attributes. If that's the case, then since the stylesheet's behaviour depends on these attributes, there's no way it will work without them.
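
To illustrate how a DTD could supply those attributes, here is a minimal sketch that uses an internal DTD subset. The ATTLIST declarations and their values are assumptions made up for this example; they are not copied from the real ONIX DTD, which was not shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical internal subset: the refname/shortname values below are
     assumptions for illustration, not taken from the actual ONIX DTD. -->
<!DOCTYPE ONIXmessage [
  <!ATTLIST ONIXmessage refname   CDATA #FIXED "ONIXMessage"
                        shortname CDATA #FIXED "ONIXmessage">
  <!ATTLIST header      refname   CDATA #FIXED "Header"
                        shortname CDATA #FIXED "header">
  <!ATTLIST m174        refname   CDATA #FIXED "FromCompany"
                        shortname CDATA #FIXED "m174">
]>
<ONIXmessage release="2.1">
  <header>
    <m174>Some Publisher</m174>
  </header>
</ONIXmessage>
```

A parser that reads the declarations reports refname and shortname on each declared element as if they had been written in the document, which is what lets the tests on @refname and @shortname in the stylesheet succeed. Without the declarations, the elements carry no such attributes and the stylesheet falls through to name(), reproducing the input unchanged.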

Upvotes: 3
