Rob Wijkstra

Reputation: 811

Is an '!'-symbol in an XML attribute name allowed?

I'm dealing with incoming XML messages from an external party, so I didn't author them myself. I believe they aren't using valid XML syntax.

Consider the following XML:

<?xml version="1.0"?>
<MyItems>
    <MyItem !myAttributeName=""/> 
    <!--    ^ Note the '!'-mark at the beginning of 'myAttributeName'.-->
</MyItems>

I'm surprised to encounter an exclamation mark at the beginning of an XML attribute name. Is this blatantly invalid XML? Or am I missing something here? Basically every XML parser and validation tool I've tried rejects this attribute name, but I'm still surprised that it is being used in the first place.
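The rejection is easy to reproduce. Below is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name first_parse_error and the SAMPLE constant are only illustrative, and the sample is the document above without the comment line.

```python
import xml.etree.ElementTree as ET

# The message from the question, minus the explanatory comment line.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<MyItems>\n'
    '    <MyItem !myAttributeName=""/>\n'
    '</MyItems>\n'
)


def first_parse_error(xml_text):
    """Return None if xml_text parses, else the (line, column) of the first error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError carries the position where the parser gave up.
        return err.position
    return None


if __name__ == "__main__":
    print(first_parse_error(SAMPLE))
    print(first_parse_error(SAMPLE.replace("!myAttributeName", "myAttributeName")))
```

With the '!' removed, the same document parses without complaint, which suggests the attribute name is the only problem.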

Upvotes: 1

Views: 621

Answers (2)

Jon Hanna

Reputation: 113372

Is this blatantly invalid XML?

Yes.

Valid names are defined by these productions from the XML 1.0 specification (fifth edition):

NameStartChar   ::=         ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] |
                            [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] |
                            [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] |
                            [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

NameChar        ::=         NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] |
                            [#x203F-#x2040]

Name            ::=         NameStartChar (NameChar)*

! is #x21, which appears in neither NameStartChar nor NameChar, so not only is it not allowed at the start of an XML name, it's not allowed anywhere within one either. In XML terms, what you have there is gibberish.
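The productions above translate fairly directly into a regular expression. The following is a sketch rather than a reference implementation; the names NAME_START, NAME_CHAR, XML_NAME and is_xml_name are made up for illustration.

```python
import re

# Character ranges copied from the NameStartChar production.
NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)

# NameChar adds "-", ".", digits, #xB7 and two further ranges.
NAME_CHAR = NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

# Name ::= NameStartChar (NameChar)*
XML_NAME = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")


def is_xml_name(candidate):
    """Return True if candidate matches the Name production in full."""
    return XML_NAME.fullmatch(candidate) is not None


if __name__ == "__main__":
    for name in ("myAttributeName", "!myAttributeName", "my!AttributeName"):
        print(name, is_xml_name(name))
```

A check like this can flag offending attribute names in a message before handing it to a real parser, though it only covers the Name production and nothing else about well-formedness.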

Upvotes: 2

DMC19

Reputation: 947

At the start of an attribute name, any symbol other than _ or : is an unexpected token for XML parsers and validation tools.

As far as I can see in the XML specification, the syntax that XML allows is:

Extender ::= #x00B7 | #x02D0 | #x02D1 | #x0387 | #x0640 | #x0E46
| #x0EC6 | #x3005 | [#x3031-#x3035] | [#x309D-#x309E] | [#x30FC-#x30FE]

CombiningChar ::= [#x0300-#x0345] | [#x0360-#x0361] | [#x0483-#x0486] | [#x0591-#x05A1]
| [#x05A3-#x05B9] | [#x05BB-#x05BD] | #x05BF | [#x05C1-#x05C2] | #x05C4
| [#x064B-#x0652] | #x0670 | [#x06D6-#x06DC] | [#x06DD-#x06DF] | [#x06E0-#x06E4]
| [#x06E7-#x06E8] | [#x06EA-#x06ED] | [#x0901-#x0903] | #x093C | [#x093E-#x094C]
| #x094D | [#x0951-#x0954] | [#x0962-#x0963] | [#x0981-#x0983] | #x09BC
| #x09BE | #x09BF | [#x09C0-#x09C4] | [#x09C7-#x09C8] | [#x09CB-#x09CD]
| #x09D7 | [#x09E2-#x09E3] | #x0A02 | #x0A3C | #x0A3E
| #x0A3F | [#x0A40-#x0A42] | [#x0A47-#x0A48] | [#x0A4B-#x0A4D] | [#x0A70-#x0A71]
| [#x0A81-#x0A83] | #x0ABC | [#x0ABE-#x0AC5] | [#x0AC7-#x0AC9] | [#x0ACB-#x0ACD]
| [#x0B01-#x0B03] | #x0B3C | [#x0B3E-#x0B43] | [#x0B47-#x0B48] | [#x0B4B-#x0B4D]
| [#x0B56-#x0B57] | [#x0B82-#x0B83] | [#x0BBE-#x0BC2] | [#x0BC6-#x0BC8] | [#x0BCA-#x0BCD]
| #x0BD7 | [#x0C01-#x0C03] | [#x0C3E-#x0C44] | [#x0C46-#x0C48] | [#x0C4A-#x0C4D]
| [#x0C55-#x0C56] | [#x0C82-#x0C83] | [#x0CBE-#x0CC4] | [#x0CC6-#x0CC8] | [#x0CCA-#x0CCD]
| [#x0CD5-#x0CD6] | [#x0D02-#x0D03] | [#x0D3E-#x0D43] | [#x0D46-#x0D48] | [#x0D4A-#x0D4D]
| #x0D57 | #x0E31 | [#x0E34-#x0E3A] | [#x0E47-#x0E4E] | #x0EB1
| [#x0EB4-#x0EB9] | [#x0EBB-#x0EBC] | [#x0EC8-#x0ECD] | [#x0F18-#x0F19] | #x0F35
| #x0F37 | #x0F39 | #x0F3E | #x0F3F | [#x0F71-#x0F84]
| [#x0F86-#x0F8B] | [#x0F90-#x0F95] | #x0F97 | [#x0F99-#x0FAD] | [#x0FB1-#x0FB7]
| #x0FB9 | [#x20D0-#x20DC] | #x20E1 | [#x302A-#x302F] | #x3099 | #x309A

NameChar ::=  Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender

Name ::=  (Letter | '_' | ':') (NameChar)*

Attribute ::= Name Eq AttValue

AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"

These EBNF productions define what XML allows in an attribute name. Since ! matches none of them, an attribute with a ! symbol in its name isn't well-formed.

Upvotes: 1
