Reputation: 3153
I am trying to use parser combinators in Scala on a simple grammar that I have copied from a book. When I run the following code it stops immediately after the first token has been parsed with the error
[1.3] failure: string matching regex '\z' expected but '+' found
I can see why things go wrong. The first token is an expression and therefor it is the only thing that needs to be parsed according to the grammar. However I do not know what is a good way to fix it.
object SimpleParser extends RegexParsers
{
def Name = """[a-zA-Z]+""".r
def Int = """[0-9]+""".r
def Main:Parser[Any] = Expr
def Expr:Parser[Any] =
(
Term
| Term <~ "+" ~> Expr
| Term <~ "-" ~> Expr
)
def Term:Parser[Any] =
(
Factor
| Factor <~ "*" ~> Term
)
def Factor:Parser[Any] =
(
Name
| Int
| "-" ~> Int
| "(" ~> Expr <~ ")"
| "let" ~> Name <~ "=" ~> Expr <~ "in" ~> Expr <~ "end"
)
def main(args: Array[String])
{
var input = "2 + 2"
println(input)
println(parseAll(Main, input))
}
}
Upvotes: 0
Views: 217
Reputation: 38045
Factor <~ "*" ~> Term
means Factor.<~("*" ~> Term)
, so the whole right part is dropped.
Use Factor ~ "*" ~ Term ^^ { case f ~ _ ~ t => ??? }
or rep1sep
:
scala> :paste
// Entering paste mode (ctrl-D to finish)
import scala.util.parsing.combinator.RegexParsers
object SimpleParser extends RegexParsers
{
def Name = """[a-zA-Z]+""".r
def Int = """[0-9]+""".r
def Main:Parser[Any] = Expr
def Expr:Parser[Any] = rep1sep(Term, "+" | "-")
def Term:Parser[Any] = rep1sep(Factor, "*")
def Factor:Parser[Any] =
(
"let" ~> Name ~ "=" ~ Expr ~ "in" ~ Expr <~ "end" ^^ { case n ~ _ ~ e1 ~ _ ~ e2 => (n, e1, e2)
| Int
| "-" ~> Int
| "(" ~> Expr <~ ")"
| Name }
)
}
SimpleParser.parseAll(SimpleParser.Main, "2 + 2")
// Exiting paste mode, now interpreting.
import scala.util.parsing.combinator.RegexParsers
defined module SimpleParser
res1: SimpleParser.ParseResult[Any] = [1.6] parsed: List(List(2), List(2))
The second part of parser def Term:Parser[Any] = Factor | Factor <~ "*" ~> Term
is useless. The first part, Factor
, can parse (with non-empty next
) any Input
that the second part, Factor <~ "*" ~> Term
, is able to parse.
Upvotes: 1