rperz86

Reputation: 135

Syntax Highlighting when using special characters

I'm currently finishing up a mathematical DSL based on LaTeX code in Rascal. This means that I have a lot of special characters (such as \, { and }). For instance, in the syntax shown below, the sum doesn't get highlighted unless I remove the \ and _{ from the syntax.

syntax Expression = left sum: '\\sum_{' Assignment a '}^{' Expression until '}' Expression e;

I've noticed that keywords that contain either \ or { and } do not get highlighted. Is there a way to overcome this?

Edit: I accidentally used data instead of syntax in this example

Upvotes: 2

Views: 161

Answers (1)

Jurgen Vinju

Reputation: 6696

There are at least two solutions: one is based on changing the grammar, the other on a post-parse tree traversal. Pick your poison :-)

The behavior is caused by the default highlighting rules, which heuristically decide what counts as a "keyword" to be highlighted by matching each literal against the regular expression [A-Za-z][A-Za-z0-9\-]*. A literal such as \sum_{ starts with a backslash, so it does not match and is not highlighted. Next to these heuristic defaults, the highlighting is fully programmable via @category tags in the grammar and @category annotations in the parse tree.
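
To see why the literal from the question falls outside the heuristic, here is a minimal sketch of the check in Rascal. The function name looksLikeKeyword is made up for illustration, and it is an assumption that the whole literal has to match the expression; the actual highlighter code may differ.

```rascal
module KeywordHeuristic

// Hypothetical helper, not part of the standard library.
// Assumption: the whole literal must match the expression from the answer.
bool looksLikeKeyword(str literal) = /^[A-Za-z][A-Za-z0-9\-]*$/ := literal;

// looksLikeKeyword("sum")      matches: a letter followed only by letters
// looksLikeKeyword("\\sum_{")  does not match: the literal starts with a backslash
```

This is also why a literal like 'sum' on its own is picked up by the defaults.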

If you change the grammar like so, you can influence highlighting via tags:

syntax Expression = left sum: SumKw Assignment a '}^{' Expression until '}' Expression e;
syntax SumKw = @category="MetaKeyword" '\\sum_{';

Another grammar-based solution is to split the literal up (this is not a language-preserving grammar refactoring, since it allows spaces between the parts):

syntax Expression = left sum: "\\" 'sum' "_{" Assignment a '}^{' Expression until '}' Expression e;

(The latter solution will trigger the heuristic for keywords again, because 'sum' on its own matches the regular expression.)

If you don't want to hack the grammar to accommodate highlighting, the other way is to add an annotation via a tree traversal, like so:

// visit returns a new tree, so keep the result
yourTree = visit(yourTree) {
  case t:appl(prod(cilit("\\sum_{"),_,_),_) => t[@category="MetaKeyword"]
};

The code is somewhat hairy because you have to match on and replace a part of the tree that you can usually ignore while thinking about your own language: the syntax rule generated for each (case-insensitive) literal, and its application to the individual characters it consists of. See ParseTree.rsc from the standard library for a detailed and formal definition of what parse trees look like under the hood.

For the latter solution to take effect, when you instantiate the IDE using the registerLanguage function from util::IDE, make sure to wrap the call to the parser in a function that executes this visit.
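
A rough sketch of such a wrapper, under stated assumptions, could look like this. The module name, the one-rule Expression grammar, the function name parseAndHighlight, and the language name and file extension are all placeholders; substitute your own grammar and start non-terminal. It also assumes that the @category annotation on Tree is declared by the imported library modules in your Rascal version.

```rascal
module HighlightSketch

import ParseTree;
import util::IDE;

// Hypothetical stand-in for the real DSL grammar, reduced to a single literal.
syntax Expression = sum: '\\sum_{';

// Parse as usual, then tag the tree of the literal so the editor highlights it.
Tree parseAndHighlight(str input, loc origin) {
  Tree tree = parse(#Expression, input, origin);
  return visit (tree) {
    case t:appl(prod(cilit("\\sum_{"), _, _), _) => t[@category = "MetaKeyword"]
  };
}

void main() {
  // "MathDSL" and "mdsl" are placeholder values for the language name and extension.
  registerLanguage("MathDSL", "mdsl", parseAndHighlight);
}
```

The point of the wrapper is that the editor only ever sees the tree returned by the function you register, so the annotation has to be added before that function returns.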

Upvotes: 1
