plep

Reputation:

Is there any use for putting a string that contains line breaks (char 10 or 13) in an XML CDATA section?

I'm currently working on some old code that has the following construct.

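// doc is an org.w3c.dom.Document obtained elsewhere; string is the value to store
Element root = doc.getDocumentElement();

if (string.indexOf('\n') >= 0 || string.indexOf('\r') >= 0) {
    // the value contains a line break: wrap it in a CDATA section
    root.appendChild(doc.createCDATASection(string));
} else {
    // no line breaks: store it as an ordinary text node
    root.appendChild(doc.createTextNode(string));
}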

I cannot think of any reason to put a string in a CDATA section just because it contains a "\n" or an "\r". I believe createTextNode will not trim or remove newlines when the string is something like "mytext\n\n\n", either when you set the value or when you retrieve it.
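One way to check that belief is a small round trip: build the element with createTextNode, serialize it, parse it back, and compare. The following is only a sketch, assuming the JDK's built-in DOM parser and identity Transformer; the class name TextNodeRoundTrip and the element name "root" are made up for illustration, and other parsers or serializers may behave differently.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the code in the question.
public class TextNodeRoundTrip {

    // Stores `value` as a plain text node under <root>, serializes the
    // document, parses the result again, and returns the text read back.
    static String roundTrip(String value) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        Document doc = builder.newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        root.appendChild(doc.createTextNode(value));

        // Identity transform: DOM tree in, XML text out.
        StringWriter xml = new StringWriter();
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.transform(new DOMSource(doc), new StreamResult(xml));

        Document parsed = builder.parse(new InputSource(new StringReader(xml.toString())));
        return parsed.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String original = "mytext\n\n\n";
        // Prints whether the trailing newlines survived the round trip.
        System.out.println(original.equals(roundTrip(original)));
    }
}
```

The sketch deliberately uses "\n" only. XML parsers normalize line endings on input, so a literal "\r" is a separate question that depends on how the serializer writes it.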

Can somebody think of a valid/useful case where you would want to put such a string in a CDATA section?

Upvotes: 0

Views: 279

Answers (6)

Robert Rossney

Reputation: 96870

Since CDATA sections allow you to put arbitrary data inside an XML document without having to understand anything about how the XML objects are going to handle it, they're frequently used by people who don't understand how the XML objects work. Generally speaking, when I see someone creating CDATA in their XML I start from the assumption that they don't really know what they're doing unless they've included a good explanation. (And more often than not, that good explanation reveals that they didn't know what they were doing.)
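To make the "arbitrary data" point concrete, here is a rough sketch of what a CDATA section changes on output compared with a text node. It assumes the JDK's identity Transformer as the serializer; the class name CdataVersusText and the sample string are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class for illustration only.
public class CdataVersusText {

    // Serializes <root> with `value` stored either as a CDATA section
    // or as an ordinary text node, and returns the resulting XML text.
    static String serialize(String value, boolean asCdata) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        doc.appendChild(root);
        root.appendChild(asCdata ? doc.createCDATASection(value) : doc.createTextNode(value));

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The text node version escapes the markup character;
        // the CDATA version writes it literally inside the section.
        System.out.println(serialize("a < b", false));
        System.out.println(serialize("a < b", true));
    }
}
```

Either way a consumer reads back the same characters. The difference is in how the serialized document looks, and the serializer already handles the escaping for text nodes.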

The original developer is probably confusing the DOM's handling of text nodes that contain whitespace with its handling of text nodes that contain only whitespace. DOMs frequently normalize whitespace-only text nodes, which can be a problem in XML like:

<xsl:value-of select="foo"/>
<xsl:text>    </xsl:text>
<xsl:value-of select="bar"/>

If the DOM normalizes the four spaces in that second element down to one space, that changes the functionality of that transform, which is an unambiguously bad thing.

But there's a reason you don't see XSLT that looks like this:

<xsl:value-of select="foo"/>
<xsl:text><![CDATA[    ]]></xsl:text>
<xsl:value-of select="bar"/>

The reason is that XSLT processors are written by people who understand how the XML objects work, and who know that in their specific case, it's important to tell the DOM to preserve whitespace in whitespace-only text nodes.
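As a rough illustration for Java, the sketch below parses a whitespace-only element with and without a CDATA section and reads the text back. It assumes the JDK's default DocumentBuilder with no DTD and no extra factory settings; the class name WhitespaceOnlyNodes and the element name "text" are made up, and other DOM implementations (or other factory settings) may treat whitespace-only nodes differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
public class WhitespaceOnlyNodes {

    // Parses `xml` with a default DocumentBuilder and returns the
    // text content of the document element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String plain = rootText("<text>    </text>");
        String cdata = rootText("<text><![CDATA[    ]]></text>");
        // Prints the number of characters read back from each variant.
        System.out.println(plain.length() + " " + cdata.length());
    }
}
```

Under these assumptions the four spaces come back in both cases, so the CDATA wrapper is not what preserves them here.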

Upvotes: 0

brabster

Reputation: 43590

Putting text inside a CDATA section should ensure that any parser passes it through uninterpreted, so the code above might be used to ensure correct formatting regardless of what a parser is told to do with whitespace.

I suppose that it effectively says that the line breaks are meaningful in that section, and not just incidental. I'm not sure why you would only use a CDATA section when line breaks are present, though, so I would guess it's a workaround rather than a by-design thing in the code given.

Upvotes: 0

Chris S

Reputation: 65456

I would say it depends entirely on whether your XML parser strips whitespace and control characters. I'm fairly certain the System.Xml ones in .NET don't, and neither do MSXML or Xerces, but there are options to make them do it.

Upvotes: 0

anon

Reputation:

In XML, CDATA preserves whitespace; ordinary text does not.

Upvotes: 1

Elijah

Reputation: 13604

I know it sounds obvious, but one useful case is embedding a plain ASCII text file whose manual formatting you want to preserve verbatim.

Another case I have encountered is outputting metadata from images, where I have no control over the formatting.

Upvotes: 1

Jordan S. Jones

Reputation: 13903

I could be way off base on this, but I seem to remember it being a good recommendation to put JavaScript code inside CDATA tags. In fact, see the selected answer for this Stack Overflow question, as it does a decent job of explaining why: When is a CDATA section necessary within a script tag?

Upvotes: 0
