Reputation: 18059
I am looking at the definition of XML element content and its definition of CharData:
[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*
[14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
I noticed that this definition of CharData does not forbid a > character inside an XML element. I assumed this was an error, so I looked at the description of CharData (emphasis mine):
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
So it seems that production [14] and the prose description of CharData are at odds. Is this assumption correct, or do parsers allow > inside an element without escaping it? Or do they automatically escape it?
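To make my reading of production [14] concrete, here is a minimal Python sketch of it as a predicate. The function name is_chardata is my own, not something from the specification, and it only models the grammar rule, not a full parser:

```python
def is_chardata(s: str) -> bool:
    """Model production [14]: [^<&]* minus any string containing ']]>'.

    A string qualifies if it has no literal '<', no literal '&',
    and does not contain the three-character sequence ']]>'.
    """
    return "<" not in s and "&" not in s and "]]>" not in s


# A lone '>' is accepted by the production as written.
print(is_chardata("a > b"))
# The sequence ']]>' is the only '>'-related case it excludes.
print(is_chardata("a ]]> b"))
```

Under this reading, the production only rules out > when it closes the sequence ]]>.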
Upvotes: 0
Views: 78
Reputation: 41127
The character > is in fact allowed within XML without escaping, but the character sequence ]]> is not.
You MAY escape any > character as &gt;, but you MUST do so if it is part of the above sequence. That is, ]]&gt; (or the equivalent with a character reference) is the correct way to represent that character sequence in XML when it is not used as the end marker of a CDATA section.
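As a rough illustration of the rule above, here is a small Python sketch using the standard library's xml.etree.ElementTree (which is backed by expat) and xml.sax.saxutils.escape. The helper name is_well_formed is made up for this example, and other parsers may report the ]]> case differently:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts doc, False if it raises ParseError."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A bare '>' in element content parses and is preserved as-is.
bare = ET.fromstring("<a>1 > 0</a>").text

# The escaped form yields exactly the same text.
escaped = ET.fromstring("<a>1 &gt; 0</a>").text

# A literal ']]>' in content is rejected; the escaped form is accepted.
raw_ok = is_well_formed("<a>x ]]> y</a>")
esc_ok = is_well_formed("<a>x ]]&gt; y</a>")

# escape() always rewrites '>' along with '&' and '<', so text passed
# through it can never produce a literal ']]>' in the output.
safe = escape("x ]]> y")

print(bare, escaped, raw_ok, esc_ok, safe, sep="\n")
```

As for the "do they automatically escape it" part of the question: parsers do not rewrite the input, but escaping helpers such as escape() typically escape every > on output, which sidesteps the ]]> case entirely.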
Upvotes: 2