secluded

Reputation: 517

Regex expression works in find, but not in .sublime-syntax file

I want to highlight anything between $( and $). In the .sublime-syntax file, I write:

%YAML 1.2
---
name: Metamath
file_extensions: mm
scope: source.mm

contexts:
  main:
    - match: \$\([\w\W]*?\$\)
      scope: comment

This matches correctly when I use it in Sublime's find, but fails when used in the YAML file.

Any suggestions about a better way to do this would be appreciated.

Upvotes: 1

Views: 87

Answers (1)

OdatNurd

Reputation: 22781

Your underlying issue is that Sublime's syntax highlighting works line by line, so a regular expression that needs to match across multiple lines will never match, because it never sees more than one line of input at a time. To perform this kind of match, you need to use the context stack to keep track of state from one line to the next.

The syntax documentation has more information on the topic, but essentially the idea is that you temporarily replace (push onto the context stack) a new set of syntax rules for a specific situation, which takes over the syntax highlighting until there is a pop to remove those rules and go back to what was happening before.

This is perhaps best demonstrated by an example modification of the syntax in your question:

%YAML 1.2
---
name: Metamath
file_extensions: [mm]
scope: source.mm

contexts:
  main:
    - match: \$\(
      scope: punctuation.definition.comment.begin
      push:
        - meta_scope: comment
        - match: \$\)
          scope: punctuation.definition.comment.end
          pop: true

Here the single rule in the main context (where all syntax matching starts) is only interested in matching the sequence \$\(, so any text other than this will be ignored and end up being plain text.

When a $( is matched, a couple of things happen. First, it gets assigned the scope punctuation.definition.comment.begin to mark it as a comment start sequence. Second, it pushes an anonymous context onto the context stack with rules that assume that they are inside of a comment.

While this context is active on the stack, the only match rules that apply are the ones in that context, which match \$\) and nothing else. Any text that is not this token is ignored, but when that token is seen, it is assigned a scope that marks it as the end of the comment, and then the pop removes this anonymous context from the stack, returning to the rules that were in effect before the comment started (which here only match other comments).

The meta_scope says that while this context is on the top of the stack, all text should be given the scope comment in addition to any other scope it might have. This applies also to the text that caused this context to be pushed (the $() as well as the text that causes the context to be popped (the $)).

The result is that everything starting at $( and ending in $), including those tokens, is marked as comment, while the start and end tokens are also assigned scopes that indicate that they start and end a comment (not strictly required, but it is good form nonetheless).

You can also use specifically named context items instead of an anonymous context as seen here; generally that is a better solution for cases where the rules might be needed in more than one place, or if there are just many rules to match.

%YAML 1.2
---
name: Metamath
file_extensions: [mm]
scope: source.mm

contexts:
  main:
    - match: \$\(
      scope: punctuation.definition.comment.begin
      push: comment_rules

  comment_rules:
    - meta_scope: comment
    - match: \@\w*
      scope: keyword.other
    - match: \$\)
      scope: punctuation.definition.comment.end
      pop: true

This example is mostly identical to the first one, but now the comment_rules context is specifically named and used by name. There is an extra entry which matches any word that starts with @ and highlights it differently, but only inside of comments.

[Image: sample of the syntax in action]

By doing it this way, you can have multiple situations in which comment rules apply without having to duplicate things (although that is admittedly contrived in this example).

When there are multiple rules in the context such as in this example, this can be a bit cleaner to look at; for something like your original example, arguably the anonymous context is a little easier to read and makes things a little clearer and more self-contained.
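To make the point about reusing the comment rules in more than one place a bit less contrived, here is a rough sketch of how that might look. It assumes you also want to scope Metamath's ${ ... $} blocks; the context names block and comments, and the meta.block and punctuation.section.block.* scopes, are just illustrative choices of mine and not anything from your original syntax. The include lets both main and block share the same rule for starting a comment:

```yaml
%YAML 1.2
---
name: Metamath
file_extensions: [mm]
scope: source.mm

contexts:
  main:
    # Comments can start at the top level...
    - include: comments
    - match: '\$\{'
      scope: punctuation.section.block.begin
      push: block

  # Hypothetical context for ${ ... $} blocks (illustration only)
  block:
    - meta_scope: meta.block
    # ...and also inside a block, without repeating the rule
    - include: comments
    # Blocks can nest, so push this same context again
    - match: '\$\{'
      scope: punctuation.section.block.begin
      push: block
    - match: '\$\}'
      scope: punctuation.section.block.end
      pop: true

  # The shared rule that starts a comment
  comments:
    - match: '\$\('
      scope: punctuation.definition.comment.begin
      push: comment_rules

  comment_rules:
    - meta_scope: comment
    - match: '\$\)'
      scope: punctuation.definition.comment.end
      pop: true
```

Because comment_rules does not include the block rules, a ${ that appears inside a comment stays part of the comment, which is probably what you want.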

Upvotes: 2
