Reputation: 3854
Can anyone explain to me why it is not possible to define an XML-like element using a context-free grammar (Chomsky, EBNF or syntax charts)?
Upvotes: 2
Views: 666
Reputation: 6173
Actually, XML is a context-free language, so it can be parsed with anything capable of parsing CFLs. CFLs are Type 2 in the Chomsky hierarchy.
It has already been done. The W3C uses EBNF notation to "completely describe" (or define) XML:
symbol ::= expression
A subset of XML, the terminals (the "leaves" of the tree), can be parsed with simple regular expressions. By that I mean regular expressions in the formal sense; you don't need the extended features of modern regex engines (such as those found in Perl, PCRE, and even Java).
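To illustrate the point about terminals, here is a minimal sketch in Python. It is not a real XML lexer: the `NAME` pattern is a deliberately simplified assumption (real XML names allow many more characters), attributes, comments and entities are ignored, and `tokenize` and `TOKEN_PATTERNS` are names made up for this example. The patterns only use character classes, concatenation and repetition, so they stay within what a formally regular expression can do.

```python
import re

# Simplified name pattern (an assumption; real XML names are broader).
NAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"

# One plain regular expression per kind of terminal.
TOKEN_PATTERNS = [
    ("close", re.compile(rf"</({NAME})\s*>")),  # </name>
    ("empty", re.compile(rf"<({NAME})\s*/>")),  # <name/>
    ("open", re.compile(rf"<({NAME})\s*>")),    # <name>
    ("text", re.compile(r"([^<]+)")),           # character data
]


def tokenize(source):
    """Split a tiny XML-like string into (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        for kind, pattern in TOKEN_PATTERNS:
            match = pattern.match(source, pos)
            if match:
                tokens.append((kind, match.group(1)))
                pos = match.end()
                break
        else:
            # No pattern matched at this position.
            raise ValueError(f"unexpected input at offset {pos}")
    return tokens


print(tokenize("<a>hi<b/></a>"))
```

The regular expressions only recognise the individual tokens. Checking that the tags nest correctly is the part that needs more than a regular language.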
Symbols are written with an initial capital letter if they are the start symbol of a regular language, otherwise with an initial lowercase letter.
There's also a website that uses BNF to parse XML. (BNF is a little more confusing to read, especially when dealing with XML, because its syntax uses angle brackets, too.)
Upvotes: 1
Reputation: 262734
This thread says:
XML is a language defined by SGML, which is a restricted form of context-free grammar (essentially a Dyck language with many types of parentheses)
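The Dyck-language view can be sketched in a few lines of Python: treat each opening tag as an opening parenthesis of its own type and each closing tag as the matching closing parenthesis, then check the nesting with a stack. This is only an illustration under assumptions; the `(kind, name)` token format and the function name `is_well_nested` are invented for the example, and the input is assumed to be tokenized already.

```python
def is_well_nested(tokens):
    """Return True if every opening tag is closed by a tag of the same name, in order."""
    stack = []
    for kind, name in tokens:
        if kind == "open":
            # An opening tag acts like an opening parenthesis of type `name`.
            stack.append(name)
        elif kind == "close":
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        # Any other token kind (e.g. text) does not affect nesting.
    return not stack


balanced = [("open", "a"), ("open", "b"), ("close", "b"), ("close", "a")]
crossed = [("open", "a"), ("open", "b"), ("close", "a"), ("close", "b")]
print(is_well_nested(balanced), is_well_nested(crossed))
```

The stack is what a pushdown automaton adds over a finite automaton, which is why the nesting is a context-free property rather than a regular one.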
Upvotes: 1