Reputation: 54300
I'm reading the XML specification at W3C, and this part of the section on attribute value normalization caught my attention:
If the attribute type is not CDATA, then the XML processor MUST further process the normalized attribute value by discarding any leading and trailing space (#x20) characters, and by replacing sequences of space (#x20) characters by a single space (#x20) character.
Does this mean that
<tag attr=" a b " />
is equivalent to
<tag attr="a b" />
Or am I misinterpreting what the specification says?
Upvotes: 3
Views: 1895
Reputation: 52888
Here's an example to supplement the correct answer by @Per Norrman (+1) and the example you used in your question.
<!DOCTYPE tag [
<!ELEMENT tag EMPTY>
<!ATTLIST tag
attr NMTOKENS #IMPLIED>
]>
<tag attr=" a b "/>
is equivalent to
<!DOCTYPE tag [
<!ELEMENT tag EMPTY>
<!ATTLIST tag
attr NMTOKENS #IMPLIED>
]>
<tag attr="a b"/>
because the attribute type of attr is NMTOKENS (plural).
However, the following would not be equivalent to the NMTOKENS example, because attr is declared as literal text (CDATA = character data):
<!DOCTYPE tag [
<!ELEMENT tag EMPTY>
<!ATTLIST tag
attr CDATA #IMPLIED>
]>
<tag attr=" a b "/>
This is because the attribute type of attr is CDATA, so the leading and trailing spaces are kept as part of the value.
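If it helps, here is a rough Python sketch of the rule quoted in the question. The function name normalize_attribute and its attr_type parameter are my own, purely for illustration; they are not part of any XML library. It also assumes the value has already been through the first normalization pass (references resolved, tabs and line breaks turned into #x20), which this sketch does not attempt.

```python
import re


def normalize_attribute(value: str, attr_type: str = "CDATA") -> str:
    """Apply the extra normalization step for non-CDATA attribute types.

    Hypothetical helper: `value` is assumed to be the already-normalized
    attribute value, and `attr_type` the declared type from the DTD.
    """
    if attr_type == "CDATA":
        # CDATA values get no further processing.
        return value
    # Discard leading and trailing space (#x20) characters only.
    stripped = value.strip(" ")
    # Replace each run of spaces with a single space.
    return re.sub(r" {2,}", " ", stripped)


print(repr(normalize_attribute(" a b ", "NMTOKENS")))
print(repr(normalize_attribute(" a b ", "CDATA")))
```

With attr_type set to NMTOKENS the two values from the question come out the same; with CDATA they stay different.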
Upvotes: 2
Reputation: 12817
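Your interpretation is correct, provided the type of attr is not CDATA. Most probably it is CDATA, though: an attribute with no declaration in a DTD is treated as CDATA, so the surrounding spaces are kept.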
The annotated XML specification helped me a lot when scrutinizing the details: http://www.xml.com/axml/testaxml.htm
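A quick way to check the CDATA case yourself is a minimal sketch with Python's standard xml.etree.ElementTree module. The variable names are just for illustration, and the XML is the snippet from the question, once with no DTD and once with attr declared as CDATA.

```python
import xml.etree.ElementTree as ET

# No DTD at all: the attribute has no declared type.
undeclared = ET.fromstring('<tag attr=" a b "/>')

# Internal DTD subset that explicitly declares attr as CDATA.
declared_cdata = ET.fromstring(
    '<!DOCTYPE tag [\n'
    '<!ELEMENT tag EMPTY>\n'
    '<!ATTLIST tag attr CDATA #IMPLIED>\n'
    ']>\n'
    '<tag attr=" a b "/>'
)

# In both cases the spaces around "a b" are part of the value.
print(repr(undeclared.get("attr")))
print(repr(declared_cdata.get("attr")))
```

Both values keep their leading and trailing spaces, so in these two cases " a b " and "a b" are not equivalent.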
Upvotes: 4