Reputation: 318
I have made the following parser to try to parse BNF:
open FParsec

type Literal = Literal of string
type RuleName = RuleName of string
type Term =
    | Literal of Literal
    | RuleName of RuleName
type List = List of Term list
type Expression = Expression of List list
type Rule = Rule of RuleName * Expression
type BNF = Syntax of Rule list

let pBNF : Parser<BNF, unit> =
    // zero or more spaces
    let pWS = skipMany (pchar ' ')
    let pLineEnd = skipMany1 (pchar ' ' >>. newline)
    // a quoted literal, delimited by either " or '
    let pLiteral =
        let pL c = between (pchar c) (pchar c) (manySatisfy (isNoneOf ("\n" + string c)))
        (pL '"') <|> (pL '\'') |>> Literal.Literal
    let pRuleName = between (pchar '<') (pchar '>') (manySatisfy (isNoneOf "\n<>")) |>> RuleName.RuleName
    let pTerm = (pLiteral |>> Term.Literal) <|> (pRuleName |>> Term.RuleName)
    let pList = sepBy1 pTerm pWS |>> List.List
    let pExpression = sepBy1 pList (pWS >>. (pchar '|') .>> pWS) |>> Expression.Expression
    let pRule = pWS >>. pRuleName .>> pWS .>> pstring "::=" .>> pWS .>>. pExpression .>> pLineEnd |>> Rule.Rule
    many1 pRule |>> BNF.Syntax
For testing, I'm running it on BNF's BNF as per Wikipedia:
<syntax> ::= <rule> | <rule> <syntax>
<rule> ::= <opt-whitespace> "<" <rule-name> ">" <opt-whitespace> "::=" <opt-whitespace> <expression> <line-end>
<opt-whitespace> ::= " " <opt-whitespace> | ""
<expression> ::= <list> | <list> <opt-whitespace> "|" <opt-whitespace> <expression>
<line-end> ::= <opt-whitespace> <EOL> | <line-end> <line-end>
<list> ::= <term> | <term> <opt-whitespace> <list>
<term> ::= <literal> | "<" <rule-name> ">"
<literal> ::= '"' <text> '"' | "'" <text> "'"
But it always fails with this error:
Error in Ln: 1 Col: 21
<syntax> ::= <rule> | <rule> <syntax>
                    ^
Expecting: ' ', '"', '\'' or '<'
What am I doing wrong?
Edit
The function I'm using to test:
let test =
    let text = "<syntax> ::= <rule> | <rule> <syntax>
<rule> ::= <opt-whitespace> \"<\" <rule-name> \">\" <opt-whitespace> \"::=\" <opt-whitespace> <expression> <line-end>
<opt-whitespace> ::= \" \" <opt-whitespace> | \"\"
<expression> ::= <list> | <list> <opt-whitespace> \"|\" <opt-whitespace> <expression>
<line-end> ::= <opt-whitespace> <EOL> | <line-end> <line-end>
<list> ::= <term> | <term> <opt-whitespace> <list>
<term> ::= <literal> | \"<\" <rule-name> \">\"
<literal> ::= '\"' <text> '\"' | \"'\" <text> \"'\""
    run pBNF text
Upvotes: 1
Views: 199
Reputation: 55195
Your first problem is with pList: sepBy1 greedily consumes the trailing spaces, but once it has done that it expects another term to follow rather than the end of the list. The simplest way to fix this is to use sepEndBy1 instead.
This will expose your next problem: pLineEnd isn't faithfully implemented, because you're always looking for exactly one space followed by a newline when you should be looking for any number of spaces (that is, you want pWS >>. newline in the interior, rather than pchar ' ' >>. newline).
Finally, note that your definition requires each rule to end with a newline, so you won't be able to parse your string as given (you'd need to append an empty line to the end). Instead, you might want to pull newline out of your definition of pRule and define the main parser as sepBy1 pRule pLineEnd |>> BNF.Syntax.
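Putting the three changes together, the parser might look roughly like the sketch below. It keeps the types and names from the question. Three details are my own assumptions rather than part of the advice above: `attempt` around `pWS >>. newline` (so that spaces followed by something other than a newline are not left consumed), `sepEndBy1` rather than `sepBy1` at the top level (so that a trailing newline is also accepted), and a final `eof` (so that leftover input is reported as an error). The short `text` value at the end is a made-up example input.

```fsharp
open FParsec

type Literal = Literal of string
type RuleName = RuleName of string
type Term =
    | Literal of Literal
    | RuleName of RuleName
type List = List of Term list
type Expression = Expression of List list
type Rule = Rule of RuleName * Expression
type BNF = Syntax of Rule list

let pBNF : Parser<BNF, unit> =
    let pWS = skipMany (pchar ' ')
    // Any number of spaces followed by a newline, one or more times.
    // `attempt` is an assumption: it backtracks if spaces were consumed
    // but no newline follows.
    let pLineEnd = skipMany1 (attempt (pWS >>. newline))
    let pLiteral =
        let pL c = between (pchar c) (pchar c) (manySatisfy (isNoneOf ("\n" + string c)))
        (pL '"') <|> (pL '\'') |>> Literal.Literal
    let pRuleName =
        between (pchar '<') (pchar '>') (manySatisfy (isNoneOf "\n<>")) |>> RuleName.RuleName
    let pTerm = (pLiteral |>> Term.Literal) <|> (pRuleName |>> Term.RuleName)
    // sepEndBy1 lets the list end with trailing spaces instead of
    // demanding another term after them.
    let pList = sepEndBy1 pTerm pWS |>> List.List
    let pExpression = sepBy1 pList (pWS >>. (pchar '|') .>> pWS) |>> Expression.Expression
    // pLineEnd is no longer part of the rule itself...
    let pRule = pWS >>. pRuleName .>> pWS .>> pstring "::=" .>> pWS .>>. pExpression |>> Rule.Rule
    // ...it separates rules instead. sepEndBy1 and eof are assumptions (see above).
    sepEndBy1 pRule pLineEnd .>> eof |>> BNF.Syntax

// Hypothetical two-rule input, just to exercise the parser.
let text = "<a> ::= <b> | \"x\" <c>\n<b> ::= 'y'"

match run pBNF text with
| Success (result, _, _) -> printfn "%A" result
| Failure (msg, _, _) -> printfn "%s" msg
```

If every rule in your input starts at the beginning of its line, the plain `skipMany1 (pWS >>. newline)` and `sepBy1 pRule pLineEnd` from the text above should behave the same on the string in the question; the extra combinators only matter for indented rules and trailing newlines.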
Upvotes: 4