ELLIOTTCABLE

Reputation: 18048

Which system should I use for parsing Scheme in OCaml?

I'm writing a compiler (badly) in OCaml, as a learning project; I'm at the point where using Jane Street's Sexplib isn't cutting it:

match str.[0] with
| '-' | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ->
   compile_int str channel

Yeah, time for a real parser.

I'm most familiar with PEGs (by which I really mean I know nothing about context-free grammars and such), but all the PEG/packrat parsers I see for OCaml seem to be suuuuper ancient and dead. There's Aurochs, whose last commit was 9 years ago and whose landing page now belongs to a French domain squatter; and there's “Teerex”, living in a subdirectory of a dead language project with no documentation, which is alive and kicking … with commits as recent as only five years ago, woah! …

Basically, I'd love advice from someone who's done some parsing work in OCaml in the last couple of years, and who knows the most idiomatic / modern approach to take. Thanks! (=

Upvotes: 0

Views: 532

Answers (1)

ivg

Reputation: 35210

We are using Jane Street's Sexplib to implement our own Lisp dialect, and it works quite well. We are currently in the process of moving to Parsexp (also from Jane Street). It is much more flexible and, most importantly, provides a nice way to annotate s-expressions with location information using compact position sets. It's really neat, and it helps in producing much nicer error messages! Parsexp also produces much nicer parsing errors and does not raise exceptions.

If you still don't want to use an existing s-expression parser, then I would suggest MParser or Menhir. The former is interesting because it allows you to implement context-dependent grammars, so you can embed macros directly into your parser and probably still provide sane error messages. (You may notice that I'm sort of fixated on error messages.) Menhir is of course much faster, and still provides enough flexibility to build a robust parser with good diagnostics.

With that said, my personal opinion is that it is better to rely on Parsexp to translate the text into a list of s-expressions, and then to perform all further translations at the sexp level.
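To make "translations at the sexp level" concrete, here is a minimal sketch that uses only the standard library. The `sexp` type is a stand-in with the same shape as Sexplib's `Sexp.t` (`Atom of string | List of t list`); the `expr` AST, the `Syntax_error` exception, and the function names are hypothetical and chosen only for illustration. It also shows how the first-character `match` from the question could turn into a plain classification of atoms.

```ocaml
(* Stand-in for the parser's output; same shape as Sexplib's Sexp.t. *)
type sexp =
  | Atom of string
  | List of sexp list

(* Hypothetical AST for a tiny Scheme subset. *)
type expr =
  | Int of int
  | Bool of bool
  | Var of string
  | If of expr * expr * expr
  | Lambda of string list * expr
  | App of expr * expr list

exception Syntax_error of string

(* A lambda parameter list may contain only plain symbols. *)
let param_of_sexp = function
  | Atom name -> name
  | List _ -> raise (Syntax_error "lambda parameter must be a symbol")

(* Classify atoms and recognise special forms by their head symbol.
   Caveat: int_of_string_opt also accepts OCaml-style literals such as
   "0x1F" or "1_000", so a real Scheme front end would need a stricter check. *)
let rec expr_of_sexp = function
  | Atom "#t" -> Bool true
  | Atom "#f" -> Bool false
  | Atom a ->
    (match int_of_string_opt a with
     | Some n -> Int n
     | None -> Var a)
  | List [ Atom "if"; c; t; e ] ->
    If (expr_of_sexp c, expr_of_sexp t, expr_of_sexp e)
  | List (Atom "if" :: _) -> raise (Syntax_error "if expects three operands")
  | List [ Atom "lambda"; List params; body ] ->
    Lambda (List.map param_of_sexp params, expr_of_sexp body)
  | List (Atom "lambda" :: _) -> raise (Syntax_error "malformed lambda")
  | List (f :: args) -> App (expr_of_sexp f, List.map expr_of_sexp args)
  | List [] -> raise (Syntax_error "empty application")

(* Usage: the sexp for (if #t -42 (f x)). *)
let example =
  expr_of_sexp
    (List [ Atom "if"; Atom "#t"; Atom "-42"; List [ Atom "f"; Atom "x" ] ])
```

With this split, the text-level parser only has to get atoms, lists, and positions right, while everything Scheme-specific lives in ordinary pattern matching, where malformed forms can be reported one case at a time.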

Upvotes: 2
