JoanComasFdz

Reputation: 3066

AvalonEdit highlight everything after a word and before the next space?

I'm using AvalonEdit in an app that runs my own custom-built language. I have defined a highlighting.xml file that works just fine.

Now I am trying to extend it so that the next word appearing after "method" is colored blue.

I came up with this regex to do so:

(?s)(?<=method )(.+?)(?= )

And tested it with this input:

via method AMethod on interface

It works fine in the tester at http://regexstorm.net/tester.

Then I tried the following rules, but none worked. With any of them in place, nothing gets highlighted at all.

<Rule foreground="DarkBlue">
  \(?s)(?<=method )(.+?)(?= )
</Rule>

<Rule foreground="DarkBlue">
  \(?s)(?&lt;=method )(.+?)(?= )
</Rule>

<Rule foreground="DarkBlue">
  (?s)(?<=method )(.+?)(?= )
</Rule>

This one did not break the highlighting, but did not work either:

<Rule foreground="DarkBlue">
  (?s)(?&lt;=method )(.+?)(?= )
</Rule>

Is what I am trying to do possible? Is the regex correct? I know next to nothing about regex.

Thanks in advance.

Update for Divisadero's answer

These break the highlighting:

<Rule foreground="DarkBlue">
  \(?s)(?<=method )([^' ']+)
</Rule>

<Rule foreground="DarkBlue">
  \(?s)(?&lt;=method )([^' ']+)
</Rule>

<Rule foreground="DarkBlue">
  (?s)(?<=method )([^' ']+)
</Rule>

This one doesn't break the highlighting, but doesn't work either:

<Rule foreground="DarkBlue">
  (?s)(?&lt;=method )([^' ']+)
</Rule>

Upvotes: 0

Views: 502

Answers (2)

Alan Moore

Reputation: 75242

It doesn't surprise me that rules based on lookbehind don't work. A syntax highlighter is just a glorified lexer, which means it doesn't use regexes the way you might expect. Instead of searching for a match, it steps through the string manually, always acting as if (1) the current position is the beginning of the string, and (2) the regex has a start anchor (\A) on the front of it. So lookbehinds aren't illegal, but they don't work; positive lookbehinds like (?<=method ) always fail, and negative lookbehinds always succeed.
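The effect is easy to reproduce outside AvalonEdit. Here is a minimal Python sketch, not AvalonEdit's actual code: it imitates a lexer by slicing the line at the current position before matching, so the regex engine has nothing to look back at. The function names `search_whole_line` and `match_at_position` are made up for illustration.

```python
import re

# The lookbehind-based pattern from the question (the (?s) flag is dropped,
# since it makes no difference on a single line).
PATTERN = re.compile(r"(?<=method )(.+?)(?= )")

LINE = "via method AMethod on interface"


def search_whole_line(line):
    """Search the way an online regex tester does: scan the whole line."""
    m = PATTERN.search(line)
    return m.group(1) if m else None


def match_at_position(line, pos):
    """Imitate a lexer: treat `pos` as the start of the string and require
    the match to begin exactly there, as if the regex had a start anchor."""
    m = PATTERN.match(line[pos:])
    return m.group(1) if m else None


name_start = LINE.index("AMethod")
print(search_whole_line(LINE))              # AMethod
print(match_at_position(LINE, name_start))  # None
```

Under that assumption the positive lookbehind can never succeed, whichever position the lexer is at, which matches the "doesn't break anything but doesn't highlight either" behaviour described in the question.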

But you shouldn't need a lookbehind anyway. In lexing most languages, you can identify a user-defined name because it looks like a name and it hasn't already been consumed by another rule (string, comment, keyword...). In your example, via, method, on and interface all look like keywords, so they should be included in your <Keywords> rule. Then you can add another rule for user-defined names, like:

<!-- name -->
<Rule foreground="DarkBlue">
  \b\w+\b
</Rule>

(That regex is just a guess, but--fun fact--the \w shorthand was invented for exactly this purpose.) If you want to differentiate between method names and other names, you can add another rule, before that one, with a more specific regex:

<!-- method name -->
<Rule foreground="LightBlue">
  \b[A-Z]\w*\b
</Rule>
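To show how rule order does the work that the lookbehind was meant to do, here is a rough Python sketch of a lexer with ordered rules. It is only an illustration of the idea, not how AvalonEdit is implemented; the keyword list is taken from the sample line in the question, and the rule labels and the `tokenize` function are hypothetical.

```python
import re

# Ordered rules: the first rule that matches at the current position wins.
# The keyword list is an assumption based on the question's sample input.
RULES = [
    ("keyword", re.compile(r"\b(?:via|method|on|interface)\b")),
    ("method-name", re.compile(r"\b[A-Z]\w*\b")),
    ("name", re.compile(r"\b\w+\b")),
]


def tokenize(line):
    """Step through the line, trying each rule at the current position."""
    tokens = []
    pos = 0
    while pos < len(line):
        for kind, regex in RULES:
            m = regex.match(line, pos)  # must match starting exactly at pos
            if m:
                tokens.append((kind, m.group()))
                pos = m.end()
                break
        else:
            pos += 1  # no rule applies here (e.g. whitespace): skip a character
    return tokens


for kind, text in tokenize("via method AMethod on interface"):
    print(kind, text)
```

With the sample line, `AMethod` falls through the keyword rule and is picked up by the method-name rule, while a lowercase name such as `doWork` would end up in the generic name rule.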

By the way, the (?s) modifier allows the dot (.) to match any character including newlines. It probably has no effect here, since the highlighter processes one line at a time, but it's definitely not doing any good.

Upvotes: 1

Divisadero

Reputation: 913

If all you want is to highlight the name after "method", use:

(?s)(?<=method )([a-zA-Z0-9])+  

The [a-zA-Z0-9] part should list whatever characters you allow in a name.

And if you really need everything except a space, just use:

(?s)(?<=method )([^' ']+) 
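One detail worth knowing: inside a character class the quotes are literal characters, so [^' '] excludes apostrophes as well as spaces, and [^ ] is enough for "everything but a space". A quick Python check, run as a plain regex search rather than inside AvalonEdit (the sample input with an apostrophe is made up for the test):

```python
import re

# Hypothetical input containing an apostrophe inside the name.
SAMPLE = "via method A'Method on interface"

# Character class as written above: excludes both the apostrophe and the space.
quoted = re.search(r"(?s)(?<=method )([^' ']+)", SAMPLE)
# Character class that excludes only the space.
plain = re.search(r"(?s)(?<=method )([^ ]+)", SAMPLE)

print(quoted.group(1))  # A
print(plain.group(1))   # A'Method
```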

Upvotes: 1
