Vasyl Demianov

Reputation: 307

Language that W3C XML Recommendation uses to present definitions

I am trying to read the W3C Recommendation for XML, and I am a bit puzzled by the language it uses to define things, the one with the ::= notation.

Most of the time those definitions look like regular expressions:

STag       ::=      '<' Name (S Attribute)* S? '>'

But from time to time I come across strange notation, like the following:

Comment    ::=      '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

What does Char - '-' mean? Match anything that Char matches excluding '-'?

Where can I find a formal definition of that language? I tried searching for "::=", but Google just ignores it. The W3C Recommendation itself doesn't have any information on the matter.

Upvotes: 1

Views: 68

Answers (2)

Michael Kay

Reputation: 163468

It's one of the many variants of BNF (Backus-Naur Form), which, as you point out, has similarities to regular expressions.

The "except" operator ("-") is a little unusual, in my experience. (Char - '-') means "Anything that matches Char and does not match '-'" - that is, any character except a hyphen.
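To make the "except" operator concrete, here is a rough Python sketch that transcribes the Comment production from the question into a regular expression. It is only an illustration: it assumes Char can be approximated as "any character" (the spec's real Char production excludes some control characters), and the names COMMENT and is_comment are made up for this example.

```python
import re

# Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
#
# (Char - '-')        -> [^-]     any character except a hyphen
# ('-' (Char - '-'))  -> -[^-]    a hyphen followed by a non-hyphen
#
# Assumption: Char is treated as "any character" here; the spec's Char
# production is narrower (it excludes some control characters).
COMMENT = re.compile(r"<!--(?:[^-]|-[^-])*-->")


def is_comment(text: str) -> bool:
    """Return True if the whole string matches the Comment production."""
    return COMMENT.fullmatch(text) is not None


print(is_comment("<!-- a-b -->"))   # True: single hyphens are allowed
print(is_comment("<!-- a--b -->"))  # False: "--" inside the body is rejected
```

Read this way, the two alternatives together say that a hyphen in the comment body must always be followed by a non-hyphen, which is why "--" cannot appear inside a comment and why a comment cannot end in "--->".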

The particular flavour of BNF that the XML specification uses is described in section 6 of the spec:

https://www.w3.org/TR/REC-xml/#sec-notation

Upvotes: 3

wero

Reputation: 33000

From the XML recommendation:

The formal grammar of XML is given in this specification using a simple Extended Backus-Naur Form (EBNF) notation.

It goes on to explain, among other things:

'string' matches a literal string matching that given inside the single quotes.

Upvotes: 0
