Reputation: 1342
I am trying to write a parser with Boost Spirit for a scripting language whose statements end at a newline OR at end of input. I therefore wrote a custom skipper that skips blanks and one-line comments (// bla bla). It detects end of line and end of input but does not consume them, so that every expression can be terminated explicitly with an "eol" or "eoi". Unfortunately, the parser either runs endlessly or reports an error for the last statement, which ends with eoi rather than eol, and I don't know how to split the expressions properly:
*qi::eol >> (*(qi::char_ - (qi::eol)) % (+qi::eol)) >> qi::eoi
This one works except for the last statement, which is ended by eoi. That makes sense, since it expects at least one eol for every line. The following expression parses endlessly:
*qi::eol >> (*(qi::char_ - (qi::eol)) % (+qi::eol | &qi::eoi)) >> qi::eoi
It should not consume the eoi, only accept it as a valid end of the statement. The eoi is still expected as the last token.
The skipper is defined as follows:
blank | lit("//") >> *(char_ - (eol | eoi) ) >> (&eol | &eoi)
In the end I want to get a list of strings each representing one line. For tests I use the following input:
// This is a comment in the first line 1234567890!"$?:;.-_+/=
globals
integer bla = 10 // This is a comment after an expression
endglobals
// This is a comment in the last line
type bla extends integer
I would expect 6 strings, two of which are empty because those lines contain only comments. For the statement
integer bla = 10 // This is a comment after an expression
it should simply strip the comment and the blanks, so
integerbla=10
should be the resulting string.
If you have any better idea how to write such grammars please tell me!
Upvotes: 2
Views: 2151
Reputation: 1163
I think you are making it more complex than it needs to be. The end of input will terminate the parsing without error, so if the source data is kosher then you shouldn't need to use eoi.
Try this skipper:
blank | lit("//") >> *(char_ - eol)
Try this for your expression parser:
*(char_ - eol) % eol
You'll have one expression per line and have to check for blank lines and discard them. To avoid that you could try something like:
*((+(char_ - eol) >> eol) | eol)
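As far as I can tell, the `+(char_ - eol) >> eol` form needs an eol after every non-empty line, so a final statement that ends at end of input would be left unconsumed by it. If that matters, one alternative is to keep the `%` form and discard the empty strings after parsing. A small sketch of that post-processing step, using only the standard library (the helper name drop_empty_lines is made up for this illustration):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: removes the empty strings that blank or
// comment-only lines leave behind in the parsed result.
void drop_empty_lines(std::vector<std::string>& lines)
{
    lines.erase(std::remove_if(lines.begin(), lines.end(),
                               [](const std::string& s) { return s.empty(); }),
                lines.end());
}
```

This keeps the grammar simple and moves the blank-line check into ordinary code, at the cost of a second pass over the result.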
Upvotes: 1