David K.

Reputation: 6371

Error reporting and recovery in parser generators

I think parser generators are a pretty nice tool to have in your programming toolkit. After playing around with a few, I wrote my own just to understand things better, and it turned out better than I expected, so I've stuck with it.

One thing that has been bugging me lately, though, is error reporting and recovery: I don't do a very good job of it. I know one method is token synchronization, but the trail seems to stop there. Other than rolling your own recursive descent parser and including all sorts of heuristics, what are some general-purpose ways of handling error reporting and error recovery in parser generators?

Upvotes: 1

Views: 1364

Answers (1)

Apalala

Reputation: 9244

With PEG, which is top-down, you can implement the "cut" feature, either inserted automatically or placed manually in the grammar, so errors are reported as close to their source as possible. See Grako and the referenced article by Kota Mizushima. A "cut" discards the remaining alternatives of a choice once certain tokens have been seen on the input, so the parser knows it can fail early instead of backtracking.
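To make the idea concrete, here is a minimal hand-written sketch of a cut in a top-down parser. It is not Grako's implementation or API; the token names and the functions `expect`, `parse_if`, `parse_assign` and `parse_statement` are made up for illustration, and tokens are assumed to be plain strings.

```python
class ParseError(Exception):
    """Parse failure; `committed` is True once a cut has been passed."""

    def __init__(self, pos, message, committed=False):
        super().__init__(f"{message} at position {pos}")
        self.pos = pos
        self.message = message
        self.committed = committed


def expect(tokens, pos, kind):
    # Consume one token of the given kind or fail at this position.
    if pos < len(tokens) and tokens[pos] == kind:
        return pos + 1
    raise ParseError(pos, f"expected {kind!r}")


def parse_if(tokens, pos):
    pos = expect(tokens, pos, "if")
    # Cut: after seeing "if", no other alternative can match, so any
    # later failure is marked as committed and must not be retried.
    try:
        pos = expect(tokens, pos, "cond")
        pos = expect(tokens, pos, "then")
        pos = expect(tokens, pos, "stmt")
    except ParseError as err:
        err.committed = True
        raise
    return pos


def parse_assign(tokens, pos):
    pos = expect(tokens, pos, "name")
    pos = expect(tokens, pos, "=")
    pos = expect(tokens, pos, "value")
    return pos


def parse_statement(tokens, pos=0):
    # Ordered choice: try each alternative in turn, PEG style.
    for alternative in (parse_if, parse_assign):
        try:
            return alternative(tokens, pos)
        except ParseError as err:
            if err.committed:
                raise  # a cut was passed: report the precise error
    raise ParseError(pos, "expected a statement")
```

With the cut, `parse_statement(["if", "cond", "stmt"])` fails with "expected 'then'" at position 2. Without it, the choice would backtrack, try the assignment alternative, and only report a generic "expected a statement" at position 0.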

In general, I don't like error recovery, as the errors reported after the first tend to be a nuisance, as Turbo Pascal once proved.

The general strategy for recovery is to perform rewrites (inserts or deletes) on the input sequence so the parser can continue. For a simple recovery strategy based solely on deletes (skipping input until an expected token), see section 5.9 of Wirth's Algorithms + Data Structures = Programs (A+D=P).
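A delete-only strategy can be sketched in a few lines. This is not Wirth's code, just an illustration of the idea under simple assumptions: the toy grammar is a list of `name = number ;` statements, tokens are plain strings, `;` is the synchronizing token, and all function names are hypothetical.

```python
class RecoverableError(Exception):
    def __init__(self, pos, message):
        super().__init__(f"{message} at token {pos}")
        self.pos = pos


def match(tokens, pos, predicate, description):
    # Consume one token satisfying `predicate`, or fail at this position.
    if pos < len(tokens) and predicate(tokens[pos]):
        return pos + 1
    raise RecoverableError(pos, f"expected {description}")


def parse_assignment(tokens, pos):
    pos = match(tokens, pos, str.isidentifier, "a name")
    pos = match(tokens, pos, lambda t: t == "=", "'='")
    pos = match(tokens, pos, str.isdigit, "a number")
    pos = match(tokens, pos, lambda t: t == ";", "';'")
    return pos


def skip_to_sync(tokens, pos, sync=";"):
    # "Delete" input up to and including the next synchronizing token.
    while pos < len(tokens) and tokens[pos] != sync:
        pos += 1
    return min(pos + 1, len(tokens))


def parse_program(tokens):
    """Return (number of statements parsed, list of error messages)."""
    errors, parsed, pos = [], 0, 0
    while pos < len(tokens):
        try:
            pos = parse_assignment(tokens, pos)
            parsed += 1
        except RecoverableError as err:
            errors.append(str(err))
            # Resume right after the next ';' following the error.
            pos = skip_to_sync(tokens, err.pos)
    return parsed, errors
```

For example, `parse_program("a = 1 ; b 2 ; c = 3 ;".split())` reports one error for the malformed second statement and still parses the first and third. Whether the follow-on errors are useful depends heavily on how well the synchronizing tokens are chosen.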

Upvotes: 4
