Bogs

Reputation: 31

XSL/XML: How to put HTML tags in an XML doc so they render

I tried replacing the tags like this:

<node><br></node>  -->  <node>&lt;br&gt;</node>

Unfortunately, when the XSL processes the XML file, I actually get

<br>

displayed on the page as literal text instead of being rendered as markup.

Upvotes: 3

Views: 5164

Answers (4)

Renzo Ciot

Reputation: 3846

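If you need to insert HTML that is not well-formed, here is a possible workaround: put the HTML in a comment inside the XML, then extract it in the XSL. Note that an XML comment cannot contain "--", and that support for disable-output-escaping is optional, so not every XSLT processor honours it.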

Example XML:

<Data>
  <!--
  <div>
    not well-formed xml<br>
  </div>
 -->
</Data>

Example XSL:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
  <xsl:template match="Data">
    <html>
      <body>
        <xsl:value-of disable-output-escaping="yes" select="comment()"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="text() | @*">
  </xsl:template>
</xsl:stylesheet>

Output:

<html>
  <body>
    <div>
     not well-formed xml<br>
    </div>
  </body>
</html>

Upvotes: 1

Flynn1179

Reputation: 12075

HTML isn't XML, although they look very similar. There are four things that are valid in HTML but not in XML, and all of them can be rewritten to be XML compliant:

  • Unclosed tags, as you discovered. Just replace these with a self-closed version: <br> becomes <br/>, etc.
  • Attributes without values, such as in <input type="checkbox" checked>. Just assign them a value with the same name as the attribute, i.e. <input type="checkbox" checked="checked" />.
  • Mismatched tags. These are a little trickier. For example, it's tolerated in HTML to write <b>A<i>B</b>C</i>, which would make A bold, C italic, and B both bold and italic. You can make this XML compliant by writing <b>A<i>B</i></b><i>C</i> or <b>A</b><i><b>B</b>C</i>.
  • Most entities. Only the five predefined entities (&lt;, &gt;, &amp;, &quot;, &apos;) and numeric character references (e.g. &#160; or &#xA0;) are available in XML by default. You can't use &nbsp; or &oslash; or anything like that unless you declare it. To fix this, include an entity declaration in the DOCTYPE's internal subset at the top of the document, such as <!ENTITY nbsp "&#160;">.
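
To make the four fixes above concrete, here is a minimal sketch of a fragment that applies all of them at once. The element name node is borrowed from the question; the text content is made up purely for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The internal subset declares nbsp so that &nbsp; can be used below. -->
<!DOCTYPE node [
  <!ENTITY nbsp "&#160;">
]>
<node>
  <!-- Unclosed tag fixed: <br> written as an empty element. -->
  first line<br/>second line
  <!-- Valueless attribute fixed: checked is given its own name as value. -->
  <input type="checkbox" checked="checked"/>
  <!-- Mismatched tags fixed: every element closes inside its parent. -->
  <b>A<i>B</i></b><i>C</i>
  <!-- HTML entity fixed: nbsp resolves through the declaration above. -->
  non&nbsp;breaking
</node>
```

A document like this should parse with any conforming XML parser, which is what makes it usable as XSLT input.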

XSLT cannot process an HTML file unless it is also well-formed XML.

As a rule, I always write HTML to be XML compliant simply because it makes the whole range of XML tools available, and there's really no reason not to.

Replacing <br> with &lt;br&gt; does not produce an XML-compliant tag. It replaces the tag with plain text that happens to look like HTML, which is why it shows up literally on the page.
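
If the escaped text has to stay in the source as it is, one option is to ask the processor not to re-escape it on output. This is only a sketch: the sample input in the comment is an assumption, and disable-output-escaping is an optional feature that some processors ignore, so check yours before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input: <node>line one&lt;br&gt;line two</node> -->
  <xsl:template match="node">
    <p>
      <!-- Emit the text of node without escaping its < and > characters. -->
      <xsl:value-of select="." disable-output-escaping="yes"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

A processor that honours the attribute writes the <br> through as markup. Keeping real <br/> elements in the source is still the more portable choice.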

Upvotes: 1

Dimitre Novatchev

Reputation: 243459

The text you provided:

<node><br></node>

is not a well-formed XML document, so it cannot be processed with XSLT 1.0.

In case you have:

<node><br/></node>

then simply output the <br/> element as-is, since it is already valid markup.

Example:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="nodes">
  <html>
   <xsl:apply-templates/>
  </html>
 </xsl:template>

 <xsl:template match="node">
  <p>
   <xsl:apply-templates/>
  </p>
 </xsl:template>
</xsl:stylesheet>

when applied on this XML document:

<nodes>
 <node>
 1 <br/>
 2 <br/>
 3 <br/>
 </node>
</nodes>

produces:

<html>
   <p>
      1 <br>
      2 <br>
      3 <br></p>
</html>

and this is displayed by the browser as:

1
2
3

Upvotes: 0

bmargulies

Reputation: 99993

Leave them as <br/> and write the appropriate XSLT transform to map them to the output as-is.
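
A minimal sketch of such a transform, assuming the element is called node as in the question and that wrapping its content in a p is what you want:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input: <node>1<br/>2<br/>3</node> -->
  <xsl:template match="node">
    <p>
      <!-- copy-of copies the children of node, br elements included,
           into the result unchanged. -->
      <xsl:copy-of select="node()"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

xsl:copy-of is enough when the content needs no further changes. If some of the nested elements have to be rewritten on the way through, an identity template with overrides is the more flexible route.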

Upvotes: 0
