Reputation: 24566
So whilst reading the DOM Level 2 spec I came across the following bit of code:
<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY bar "bar2">
<!ENTITY % baz "baz">
]>
<ex/>
and whilst I understand why it's so broken when sticking it into HTML, why does it still display ]> yet parse out the <ex/> closing tag in the HTML?
Upvotes: 0
Views: 160
Reputation: 201788
It is not HTML at all. It’s a rather trivial piece of generic XML. Formally, “ex.dtd” refers to an external resource (such as another file) that is supposed to contain a Document Type Definition (DTD).
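To check that the snippet is fine when treated as XML, here is a minimal sketch using Python's standard-library xml.etree.ElementTree (chosen for illustration only; it is not mentioned in the question). One deliberate change, which is an assumption made for the demonstration: <ex/> is given a body that references &bar; so that the effect of the entity declarations becomes visible. The external ex.dtd is not fetched, because this parser does not load external DTDs by default.

```python
import xml.etree.ElementTree as ET

# The document from the question, except that <ex/> now contains &bar;
# (a hypothetical variation, added only to make the entity visible).
DOC = """<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY bar "bar2">
<!ENTITY % baz "baz">
]>
<ex>&bar;</ex>"""

# An XML parser consumes the whole DOCTYPE, internal subset included,
# so neither the declarations nor "]>" end up in the document content.
root = ET.fromstring(DOC)

# In XML the first declaration of an entity is the binding one,
# so the second declaration of "bar" is ignored.
print(root.tag, root.text)
```

The root element is ex and nothing from the internal subset leaks into the content, which is the behaviour the DOM spec example relies on.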
When you throw generic XML at a browser and serve it as HTML (e.g., with Content-Type: text/html specified in the HTTP headers), funny things may happen, because the browser tries to parse it as HTML.
In particular, browsers do not read DTDs, and they do not parse document type declarations (DOCTYPE declarations) according to the formal specifications; they just recognize a limited set of specific doctype strings. They do not recognize the [...] part, which is an XML (and SGML) construct containing an “internal subset” of a DTD, i.e. a way to augment an external DTD with additional declarations, such as the entity declarations here. They expect the doctype string to have ended when they see the “<” in the first ENTITY declaration, ignore those declarations, and then treat “]>” as character data.
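The effect described above can be reproduced roughly outside a browser. The sketch below uses Python's standard-library html.parser, which is an assumption of convenience: it is not a browser and not a conforming HTML5 parser, so the exact point at which it considers the doctype finished may differ from what a given browser does. It is similarly lenient, though, and shows the same overall outcome: the declarations are swallowed, “]>” comes out as text, and <ex/> is seen as a tag. The EventLogger class is a hypothetical helper written for this example.

```python
from html.parser import HTMLParser

# The document from the question, unchanged.
DOC = """<!DOCTYPE ex SYSTEM "ex.dtd" [
<!ENTITY foo "foo">
<!ENTITY bar "bar">
<!ENTITY bar "bar2">
<!ENTITY % baz "baz">
]>
<ex/>"""


class EventLogger(HTMLParser):
    """Hypothetical helper: records what an HTML-style parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_decl(self, decl):
        # Called for whatever the parser takes to be the doctype.
        self.events.append(("decl", decl))

    def handle_comment(self, data):
        # Unrecognized <!...> constructs are reported as comments.
        self.events.append(("comment", data))

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag))

    def handle_data(self, data):
        # Skip whitespace-only text so the log stays readable.
        if data.strip():
            self.events.append(("text", data.strip()))


parser = EventLogger()
parser.feed(DOC)
parser.close()

for event in parser.events:
    print(event)
```

In the printed log, look for a text event containing “]>” and a starttag event for ex; none of the ENTITY declarations is reported as an element or as text.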
Upvotes: 2