Steven Klusener

Reputation: 159

Parsing comment in Rascal

I have a very basic question about parsing a fragment that contains a comment. First we import my favorite language, Pico:

import lang::pico::\syntax::Main;

Then we execute the following:

 parse(#Id,"a");

gives, as expected:

 Id: (Id) `a`

However,

parse(#Id,"a\n%% some comment\n");

gives a parse error.

What am I doing wrong here?

Upvotes: 2

Views: 353

Answers (1)

Davy Landman

Reputation: 15438

There are multiple problems.

  1. Id is a lexical, so layout (which includes comments) is never inserted into it.
  2. Layout is only inserted between the elements of a production, and the Id lexical consists of a single character class, so there is no place to insert layout.
  3. Even if Id were a syntax nonterminal with multiple elements, comments would be parsed between those elements, not before or after them.

For more on the difference between syntax, lexical, and layout see: Rascal Syntax Definitions.

If you want to parse comments around a nonterminal, use the start modifier on that nonterminal. Normally, layout is only inserted between the elements of a production; with start, it is also inserted before and after it.

For example, take this grammar:

layout L = [\t\ ]* !>> [\t\ ];
lexical AB = "A" "B"+;
syntax CD = "C" "D"+;
start syntax EF = "E" "F"+;

This will be transformed into the following grammar:

AB   = "A" "B"+;
CD'  = "C" L "D"+;
EF'  = L "E" L "F"+ L;
"B"+ = "B"+ "B" | "B";
"D"+ = "D"+ L "D" | "D";
"F"+ = "F"+ L "F" | "F";
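
To make the effect of that transformation concrete, here is a minimal sketch that tries a few inputs against the grammar above. The module name LayoutDemo and the helper accepts are made up for this illustration, and the comments describe what the transformed grammar suggests should happen rather than verified output.

```rascal
module LayoutDemo

import ParseTree;
import Exception;

layout L = [\t\ ]* !>> [\t\ ];
lexical AB = "A" "B"+;
syntax CD = "C" "D"+;
start syntax EF = "E" "F"+;

// Hypothetical helper: true if `src` parses as the given nonterminal,
// false if the parser throws a ParseError.
bool accepts(type[&T <: Tree] nt, str src) {
  try {
    parse(nt, src);
    return true;
  }
  catch ParseError(_): return false;
}

// Lexical: no layout anywhere, so a space between "A" and "B" should be rejected.
bool lexicalInner() = accepts(#AB, "A B");

// Syntax: layout between elements should be accepted ...
bool syntaxInner() = accepts(#CD, "C D D");

// ... but leading layout should be rejected without the start wrapper.
bool syntaxOuter() = accepts(#CD, " C D");

// Start: layout before, between, and after should all be accepted.
bool startOuter() = accepts(#start[EF], " E F ");
```

Going by the transformed productions, lexicalInner() and syntaxOuter() should return false, while syntaxInner() and startOuter() should return true.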

So, in particular, if you want to parse a string with layout around it, you could write this:

lexical Id = [a-z]+;
start syntax P = Id i;
layout L = [\ \n\t]*;

parse(#start[P], "\naap\n").top;   // parses and returns the P node
parse(#start[P], "\naap\n").top.i; // parses and returns the Id node
parse(#P, "\naap");                // parse error at 0 because start wrapper is not around P
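
Applied to the Pico example from the question, the same idea might look like the sketch below. The module name PicoIdWithLayout, the nonterminal WrappedId, and the function parseId are hypothetical names introduced here. The sketch also assumes that extending the Pico syntax module (rather than importing it) makes Pico's layout, including its %% comments, apply to the new production.

```rascal
module PicoIdWithLayout

// Assumption: `extend` brings Pico's layout definition into this module,
// so it is also inserted around and inside productions defined here.
extend lang::pico::\syntax::Main;
import ParseTree;

// Hypothetical wrapper: a start nonterminal that contains just one Id.
start syntax WrappedId = Id id;

// Parse with the start wrapper so that layout (whitespace and comments)
// before and after the Id is accepted, then project out the Id node.
Id parseId(str src) = parse(#start[WrappedId], src).top.id;
```

With this in place, a call such as parseId("a\n%% some comment\n") is intended to return the Id node for a instead of raising a parse error.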

Upvotes: 1
