Reputation: 1554
I am working with the Java15 grammar and have a couple of questions about how Rascal's parser works and why some things aren't working. Given a concrete syntax:
module tests::Concrete
start syntax CompilationUnit =
compilationUnit: TypeDec* LAYOUTLIST
;
syntax TypeDec =
ClassDec
;
syntax ClassDec =
\class: ClassDecHead ClassBody
;
syntax ClassDecHead =
"class" Id
;
syntax ClassBody =
"{" ClassBodyDec* "}"
;
syntax ClassBodyDec =
ClassMemberDec
;
syntax ClassMemberDec =
MethodDec
;
syntax MethodDec =
\method: MethodDecHead
;
syntax MethodDecHead =
ResultType Id
;
syntax ResultType =
\void: "void"
;
syntax Id =
\id: [A-Z_a-z] !<< ID \ IDKeywords !>> [0-9A-Z_a-z]
;
keyword Keyword =
"void"
;
keyword IDKeywords =
"null"
| Keyword
;
lexical LAYOUT =
[\t-\n \a0C-\a0D \ ]
;
lexical ID =
[A-Z_a-z] [0-9A-Z_a-z]*
;
layout LAYOUTLIST =
LAYOUT* !>> [\t-\n \a0C-\a0D \ ] !>> ( [/] [*] ) !>> ( [/] [/] ) !>> "/*" !>> "//"
;
an AST definition:
module tests::Abstract
data Declaration =
\compilationUnit(list[Declaration] body)
| \package(ID name)
| \import(ID name)
| \class(ID name, list[Declaration] body)
| \method(Type ret, ID name)
;
data Type =
\void()
;
data ID =
\id(str id)
;
and a driver to load files:
module tests::Load
import Prelude;
import tests::Concrete;
import tests::Abstract;
public Declaration load(loc l) = implode(#Declaration, parse(#CompilationUnit, l));
I'm finding some oddities in what is actually working and what isn't. If I take the program:
class A {
}
This parses as expected into: compilationUnit([ class(id("A"),[]) ])
But parsing and constructing AST nodes for methods inside of the class is proving to be a bit hairy. Given the program:
class A {
void f
}
this produces a "Cannot find a constructor for Declaration" error. If I modify the syntax to be:
syntax MethodDecHead =
ResultType
;
and the AST to be:
| \method(Type ret)
I'm able to get the tree I would expect: compilationUnit([class(id("A"),[method(void())])])
I'm confused about what's going on here: how keywords are handled, and what's causing this behaviour.
In addition to this, if I don't add the LAYOUTLIST to the end of the start syntax production, I get a ParseError anytime I try to read from a file.
Upvotes: 2
Views: 247
Reputation: 111
The production rule of ClassDec is not compatible with the AST node class.
Changing it to:
syntax ClassDec =
\class: "class" Id "{" ClassBodyDec* "}"
;
makes it more regular and isomorphic with the AST node class(ID name, list[Declaration]).
However, the names should always correspond, so I'd suggest changing ID to Id in the grammar. Further, your AST node expects Declarations, but in the grammar you have ClassBodyDecs.
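The same inlining idea could plausibly be applied to the method production from the question. The sketch below is an assumption and was not tested against the asker's setup: it supposes that the unlabeled MethodDecHead, which carries two AST-relevant children, is what prevents the method constructor from matching, so the head is folded into the labeled production.

```rascal
// Hypothetical adjustment: inline MethodDecHead so that the labeled
// production has the same two children as the AST constructor.
syntax MethodDec
  = \method: ResultType Id
  ;

// The constructor from tests::Abstract that this production is meant to match:
//   | \method(Type ret, ID name)
```

With this shape, the \method production and the \method constructor have the same arity and order of children, which is the kind of one-to-one correspondence described above.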
The general rules for implode are:
* A labeled lexical production, such as lexical Id = id: [a-z]+, can map to data Id = id(str x);
* For unlabeled productions, implode "looks over them": so if I had syntax A = B; syntax B = cons: "bla", then I can use the ADT: data A = cons().
(These rules are documented in ParseTree.rsc, https://github.com/cwi-swat/rascal/blob/master/src/org/rascalmpl/library/ParseTree.rsc)
Upvotes: 2
Reputation: 6696
I'm not the expert on implode so I'll leave that for now, but the LAYOUTLIST thing is due to the way parse is called.
Every start non-terminal defined by start syntax Something = ... produces two types, namely:
* the non-terminal itself, Something, and
* a wrapper non-terminal named start[Something].
The wrapper is automatically/implicitly defined by this:
syntax start[Something] = LAYOUTLIST before Something top LAYOUTLIST after;
So, if you want to have whitespace and comments before and after your program you call parse like so:
parse(#start[Something], yourLocation)
And if you are not interested in keeping the comments or whitespace for later, then you could project out the top tree like so:
Something mySomething = parse(#start[Something], myLocation).top;
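Applied to the driver from the question, this might look like the sketch below. It assumes the trailing LAYOUTLIST has been dropped from the CompilationUnit production, since the start[CompilationUnit] wrapper already accepts layout on both sides. It only addresses the ParseError, not the implode issue.

```rascal
module tests::Load

import ParseTree;        // provides parse and implode
import tests::Concrete;
import tests::Abstract;

// Parse through the implicit start[CompilationUnit] wrapper so that layout
// before and after the program is accepted, then project out the
// CompilationUnit tree with .top before imploding it to the AST.
public Declaration load(loc l) =
  implode(#Declaration, parse(#start[CompilationUnit], l).top);
```

Calling load on a file location then goes through the wrapper, so leading and trailing whitespace in the file no longer needs to be handled in the grammar itself.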
Upvotes: 1