Reputation: 1
I am writing a grammar for a small meta language. That language should include code blocks of another language (e.g., JavaScript, C, or the like). I would like to treat these code blocks just as plain strings that are printed out unchanged. My language is C/Java syntax based, using { } for code blocks. But I would also like to use { } for the code blocks of the embedded language. Here is some example code:
// my language
modul Abc {
    input x: string;
    otherLang {
        // this is now a code block from the second
        // language, which I do not want to analyze
        // It might itself contain { } like
        if (something) {
            abc = "string";
        }
    }
}
How would I reuse { and } for those different uses without mixing them up with the ones from the embedded language?
Upvotes: 0
Views: 113
Reputation: 6001
An interesting way to do this is to use mode recursion. ANTLR internally maintains a mode stack.
Although a bit verbose, the recursed mode offers the possibility of handling things -- like comments and escaped chars -- that could otherwise throw off the nesting.
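To illustrate why those special cases matter, here is a small Python sketch (not ANTLR; the function name find_block_end is made up for this example). It assumes the embedded language uses double-quoted strings with backslash escapes and // line comments, which may not hold for your language. A naive search for the first } is fooled by a brace inside a string, while a scanner that skips strings and comments finds the real end of the block.

```python
def find_block_end(text, start):
    """Return the index of the '}' closing a block whose '{' sits just before `start`.

    Braces inside double-quoted strings and // line comments are ignored.
    Returns -1 if the block is never closed.
    """
    depth = 1                    # the opening brace was already consumed
    i = start
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == '"':            # skip a string literal, honouring \" escapes
            i += 1
            while i < n and text[i] != '"':
                i += 2 if text[i] == "\\" else 1
            i += 1               # step past the closing quote
            continue
        if text.startswith("//", i):   # skip a line comment
            nl = text.find("\n", i)
            i = n if nl == -1 else nl + 1
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


src = 'a = "}"; // }\n}'
print(src.index("}"))           # naive search stops inside the string
print(find_block_end(src, 0))   # the scanner skips the string and the comment
```

In a lexer mode, the same effect would presumably come from extra rules for string literals and comments that also use more, placed ahead of the catch-all rule.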
One thing to be aware of is that rules using the more command concatenate their matched content into the token produced by the first following rule that does not use more. The following example uses the virtual token OTHER_END to provide semantic clarity and to avoid confusion with what would otherwise be an RPAREN token.
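// Lexical modes require a separate lexer grammar; the parser rule
// otherLang belongs in the corresponding parser grammar.
tokens {
    OTHER_END
}
otherLang : OTHER_BEG OTHER_END+ ; // multiple 'end's dependent on nesting
// Allow whitespace between the keyword and the brace, as in "otherLang {"
OTHER_BEG : 'otherLang' [ \t\r\n]* LPAREN -> pushMode(Other) ;
LPAREN : LParen ;
RPAREN : RParen ;
WS : [ \t\r\n]+ -> skip ;
mode Other ;
// handle special cases (comments, string literals, escaped chars) here
O_RPAREN : RParen -> type(OTHER_END), popMode ;
O_LPAREN : LParen -> more, pushMode(Other) ;
O_STUFF : . -> more ;
fragment LParen : '{' ;
fragment RParen : '}' ;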
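To see what token stream the Other mode produces, here is a rough Python simulation of its three rules (a sketch only, not what ANTLR generates; the name lex_other_block is made up). It assumes the opening brace has already been consumed by OTHER_BEG, and it ignores the special cases mentioned above.

```python
def lex_other_block(text):
    """Simulate mode Other on `text`, which starts right after the opening brace.

    Returns (tokens, rest): the OTHER_END tokens emitted and the input left
    over once the mode stack is back in the default mode.
    """
    mode_stack = ["Other"]       # OTHER_BEG pushed mode Other once
    pending = ""                 # text accumulated by the `more` rules
    tokens = []
    i = 0
    while mode_stack and i < len(text):
        ch = text[i]
        i += 1
        pending += ch
        if ch == "{":            # O_LPAREN: more, pushMode(Other)
            mode_stack.append("Other")
        elif ch == "}":          # O_RPAREN: type(OTHER_END), popMode
            tokens.append(("OTHER_END", pending))
            pending = ""
            mode_stack.pop()
        # anything else is O_STUFF: more, so just keep accumulating
    return tokens, text[i:]


tokens, rest = lex_other_block(' if (x) { a = 1; } } tail')
for kind, value in tokens:
    print(kind, repr(value))
print("rest:", repr(rest))
```

Each closing brace ends one OTHER_END token, so a block with one nested pair yields two tokens; that is why the parser rule needs OTHER_END+. Joining the token texts gives back the embedded code followed by the final closing brace.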
Upvotes: 1