Duolasa

Reputation: 25

Why is it invalid to have "(" or ")" characters in an XML Element Name?

I'm currently having some problems with an application that generates XML at runtime and then tries to parse it elsewhere.

In some cases I'm getting an error with the message "error parsing attribute name". Here is an example of XML that fails:

<datastore>
   <row id="Timer?ID=0">
      <ID>0</ID>
      <START_TIME_(sec)>120</START_TIME_(sec)>
   </row>
</datastore>

The parser seems to fail as soon as it tries to read the ( character. The same happens with other characters such as ) and ?.

I thought that the only invalid characters in XML were the ones specified in this answer: https://stackoverflow.com/a/1091953

Any idea why this could be failing?

Upvotes: 0

Views: 5186

Answers (2)

IMSoP

Reputation: 97968

The answer you found lists the characters reserved in the text of an XML document, i.e. the contents of elements and the values of attributes. However, your example uses punctuation within the name of an element, which is subject to stricter limits.
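To make the distinction concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser (the helper name `is_well_formed` and the sample strings are made up for illustration). The same punctuation is accepted in element text and in an attribute value, but rejected in an element name:

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Parentheses inside element text: allowed.
in_text = "<START_TIME>120 (sec)</START_TIME>"
# '?' and '=' inside an attribute value: allowed.
in_attribute = '<row id="Timer?ID=0"/>'
# Parentheses inside the element name itself: not allowed.
in_name = "<START_TIME_(sec)>120</START_TIME_(sec)>"

for sample in (in_text, in_attribute, in_name):
    print(is_well_formed(sample), sample)
```

Only the last sample should be rejected. The exact error message depends on the parser, which is probably why yours reports a problem with an attribute name.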

The full list of allowed characters can be found in the XML specification; note that the first character of the name is even further restricted. (XML 1.1 expands the list of allowed characters slightly to reflect evolution of the Unicode standard.) The main thing to notice is that most of the common ASCII punctuation (the characters with Unicode code points below #x7f) is excluded.

It is common practice to use only names which begin with a letter and continue with letters, digits, underscores and hyphens, but a well-written XML parser should handle a wider range of Unicode characters should you wish to use them.

Names beginning with "xml" (in any combination of upper and lower case) are specially reserved, and names containing colons will be interpreted as using namespaces, so those should also be avoided.
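If you generate element names at runtime, a check along these lines can catch bad names before they reach the parser. This is only a sketch of the conservative ASCII convention described above, not the full Name production from the specification. The function name `is_safe_element_name` is hypothetical, and permitting a leading underscore and periods is my assumption; both are legal in XML, but you may prefer to be stricter.

```python
import re

# Conservative ASCII-only subset of legal XML names (an assumption, not
# the full spec): start with a letter or underscore, then letters,
# digits, underscores, hyphens or periods. Colons are excluded on
# purpose because they would be read as namespace prefixes.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*\Z")

def is_safe_element_name(name: str) -> bool:
    """Return True if name fits the conservative subset above."""
    # Names starting with "xml" in any letter case are reserved.
    if name.lower().startswith("xml"):
        return False
    return _SAFE_NAME.match(name) is not None

for candidate in ("START_TIME_sec", "START_TIME_(sec)", "xmlData", "ns:tag", "1st"):
    print(candidate, is_safe_element_name(candidate))
```

Of these candidates, only `START_TIME_sec` passes.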

Note that there is no escape mechanism for these restricted characters; you just have to design your format not to need them.
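One possible redesign, offered as a sketch rather than the only option, is to keep the element name plain and move the unit into an attribute. The `unit` attribute name and the `build_row` helper below are assumptions for illustration; they are not part of your existing format.

```python
import xml.etree.ElementTree as ET

def build_row(row_id, fields):
    """Build a <row> element from (name, unit_or_None, value) tuples.

    Hypothetical helper: the unit goes into a "unit" attribute instead
    of being embedded in the element name.
    """
    row = ET.Element("row", {"id": row_id})
    for name, unit, value in fields:
        child = ET.SubElement(row, name)
        if unit is not None:
            child.set("unit", unit)
        child.text = str(value)
    return row

datastore = ET.Element("datastore")
datastore.append(
    build_row("Timer?ID=0", [("ID", None, 0), ("START_TIME", "sec", 120)])
)

xml_text = ET.tostring(datastore, encoding="unicode")
print(xml_text)

# The result parses back without errors, and the unit is still available.
parsed = ET.fromstring(xml_text)
print(parsed.find("row/START_TIME").get("unit"))
```

Be aware that `ElementTree` does not validate element names when serializing, so a generator built on it can still emit names like `START_TIME_(sec)` unless you check them yourself.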

Upvotes: 3

Jonathan

Reputation: 1296

The characters listed in that answer are the ones that must be escaped in an element's text; XML element names follow their own, separate naming rules.

XML elements must follow these naming rules:

  • Element names are case-sensitive
  • Element names must start with a letter or underscore
  • Element names cannot start with the letters xml (or XML, or Xml, etc)
  • Element names can contain letters, digits, hyphens, underscores, and periods
  • Element names cannot contain spaces

    Any name can be used, no words are reserved (except xml).

(source: http://www.w3schools.com/xml/xml_elements.asp)

This means that parentheses are not valid in an element name.

Upvotes: 0
