Pratik

Reputation: 23

Is a newline character invalid in XML?

Based on what I found in the XML specification, the following three characters are the only ones that are illegal:

  1. &
  2. <
  3. >

We are working with a vendor on a tool which does not seem to be able to process a newline character.

e.g.

<Comments> This is line 1
This is line 2
</Comments>

will produce an error in the tool, and the root cause I am being given is that a newline character is not allowed in XML. As far as I can tell, the specification does not clearly say anything about this.

I am trying to understand whether a newline is indeed an invalid character in XML or whether this could be a limitation of the tool.

Upvotes: 2

Views: 9236

Answers (2)

kjhughes

Reputation: 111561

Presumably you mean to ask about well-formed, not valid, XML. (See Well-formed vs Valid XML for details on the difference.)

Newline characters are most certainly allowed in well-formed XML.

  • &#13; (#xD) is CR
  • &#10; (#xA) is LF

(Windows usually ends lines with CR+LF; macOS and Linux, LF; classic Mac OS, CR.)
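To double-check which character reference maps to which control character, here is a minimal Python sketch. It assumes the standard-library `xml.etree.ElementTree` parser; the element name `a` is just a placeholder.

```python
import xml.etree.ElementTree as ET

# Code points of the two line-ending characters.
cr_code = ord("\r")  # carriage return, decimal 13, hex 0xD
lf_code = ord("\n")  # line feed, decimal 10, hex 0xA

# Character references in element content resolve to those characters.
lf_text = ET.fromstring("<a>one&#10;two</a>").text
cr_text = ET.fromstring("<a>one&#13;two</a>").text
```

Both documents parse without error, which is what you would expect if CR and LF are legal XML characters.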

The XML Recommendation does indeed clearly allow both. See Character Range:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
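As a rough illustration, the production above can be transcribed into a small Python check, and the document from the question can be fed to a conforming parser. This is only a sketch: `is_xml_char` is a hypothetical helper name, and `xml.etree.ElementTree` stands in for whatever parser the vendor tool uses.

```python
import xml.etree.ElementTree as ET


def is_xml_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# The question's example, with literal newlines inside the element content.
doc = "<Comments> This is line 1\nThis is line 2\n</Comments>"
root = ET.fromstring(doc)  # would raise ET.ParseError if not well-formed
comment_text = root.text   # the newlines are preserved in the parsed text
```

If a parser accepts this input, as the standard-library one does, the error is more likely a limitation of the tool than a problem with the XML.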

Common Usage

Within an element, new lines are typically significant to an application:

<a>one
two</a>

usually means something different than

<a>one two</a>

Between markup, new lines typically are insignificant:

<a>
   <b>one</b>
</a>

usually means the same as

<a><b>one</b></a>
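A short sketch of the distinction, again assuming Python's `xml.etree.ElementTree`. Note that the parser itself still reports the whitespace between tags; it is the application that typically chooses to ignore it.

```python
import xml.etree.ElementTree as ET

# Newline inside element content: the text values differ.
with_newline = ET.fromstring("<a>one\ntwo</a>")
with_space = ET.fromstring("<a>one two</a>")
texts_differ = with_newline.text != with_space.text

# Newlines between markup: the child element carries the same data.
indented = ET.fromstring("<a>\n   <b>one</b>\n</a>")
compact = ET.fromstring("<a><b>one</b></a>")
same_child_text = indented.find("b").text == compact.find("b").text

# The indentation is still visible to the application as whitespace-only text.
indent_whitespace = indented.text
```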

Other Characters

Finally, you're painting in somewhat sloppy strokes in saying that &, <, and > are illegal. Instead, use the following guidelines:

  • &: must use &amp; if not a part of an entity reference.
  • <: must use &lt; if not a part of a tag, comment, PI, etc.
  • >: must use &gt; if part of the string ]]> in content (that is, when ]]> is not closing a CDATA section).
  • ': must use &apos; if within attribute values delimited by '.
  • ": must use &quot; if within attribute values delimited by ".
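In practice you rarely escape these by hand. As a sketch of the guidelines above, Python's standard `xml.sax.saxutils` module provides `escape` for element content and `quoteattr` for attribute values; the sample strings and the `title` attribute are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = 'Tom & Jerry <3 "quotes" ]]>'
attr_value = 'it\'s "quoted"'

# escape() replaces &, < and > in element content.
escaped_text = escape(raw)

# quoteattr() picks a delimiter and escapes whatever conflicts with it.
quoted_attr = quoteattr(attr_value)

# Round trip: the parser should hand back the original strings.
elem = ET.fromstring(f"<a title={quoted_attr}>{escaped_text}</a>")
```

Note that `escape` always replaces `>`, which is stricter than the specification requires but harmless.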


Upvotes: 4

Quentin

Reputation: 943564

No. New line characters are perfectly acceptable in XML. They are included in the character range for text.

Upvotes: 0
