Reputation: 2773
I am trying to build a regex for matching strings like
1.) $(Something)
2.) $(SomethingElse, ")")
3.) $(SomethingElse, $(SomethingMore), Bla)
4.) $$(NoMatch) <-- should not match
5.) $$$(ShouldMatch) <-- so basically $$ will produce $
in a text.
EDIT: The words Something, SomethingElse, NoMatch and ShouldMatch are only placeholders; they are names of macros and can be any other words. The strings I am trying to match are "macro calls", which can occur anywhere in a text and should be replaced by their result. I need the regex only for syntax highlighting: a complete macro call should be highlighted. Number 3 is currently not so important. Numbers 1 and 2 are required to work. It's fine if numbers 4 and 5 do not work as written above, as long as any $( directly after a $ does not match.
Currently I have
(?<!\$)+\$\(([^)]*)\)
This matches any $( that has no leading $, which could be fine if I do not find another way to handle the $$ structure.
The next step I would like to get done is to ignore the closing bracket if it is in quotes. How could I achieve this?
EDIT: So if I have an input like
Some text, doesn't matter what. And a $(MyMacro, ")") which will be replaced.
then the complete $(MyMacro, ")") should get highlighted.
I already have this expression
"(?:\\\\|\\"|[^"])*"
for quotes including escaping of quotes. But I don't know how to apply this in a way to ignore everything between them...
P.S. I am using .NET to apply the regular expressions. So balanced groups will be supported. I just don't know how to apply all this.
Upvotes: 6
Views: 2710
Reputation: 10169
Things like this are complicated... so don't get scared of the following:
RegEx: (?<!\$)(?:\$\$)*(\$\((?:[\w, ]+|(?>"(?:(?<=\\)"|[^"])+")|(?1)+)*\))
Explained demo here: http://regex101.com/r/yZ5dI7
This follows all 5 of your points: it will match the first 3 macro types, and even deeper variations with multiple " or macro-inside-macro, but only when the number of $ prefixing the call is odd.
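As a rough illustration of the odd-$ rule only: the (?1) recursion above is PCRE syntax (which is what the regex101 demo runs) and, as far as I know, is not available in the .NET engine, so this sketch isolates just the prefix part of the pattern. The variable name and the sample inputs are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// Prefix only: no '$' before the run, any number of "$$" pairs,
// then one more '$' and the opening '('. That is an odd run of '$'.
var oddDollarPrefix = new Regex(@"(?<!\$)(?:\$\$)*\$\(");

foreach (var input in new[] { "$(A)", "$$(B)", "$$$(C)", "$$$$(D)" })
{
    // Prints True for "$(A)" and "$$$(C)", False for "$$(B)" and "$$$$(D)".
    Console.WriteLine($"{input,-10} {oddDollarPrefix.IsMatch(input)}");
}
```

Matching the rest of the call (quotes and nested parentheses) still needs something else in .NET, for example balancing groups.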
Upvotes: 1
Reputation: 33918
You can use an expression like this:
(?<! \$ )                         # not preceded by $
\$ (?: \$\$ )?                    # $ or $$$
\(                                # opening (
(?>                               # non-backtracking atomic group
    (?>                           # non-backtracking atomic group
        [^"'()]+                  # literals, spaces, etc
      | " (?: [^"\\]+ | \\. )* "  # double quoted string with escapes
      | ' (?: [^'\\]+ | \\. )* '  # single quoted string with escapes
      | (?<open> \( )             # open += 1
      | (?<close-open> \) )       # open -= 1, only if open > 0 (balancing group)
    )*
)
(?(open) (?!) )                   # fail if open > 0
\)                                # final )
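The pattern can be used exactly as written above, comments included, as long as extended mode is enabled. For example in C#, where each " inside the verbatim string is doubled: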
var regex = new Regex(@"(?x) # enable eXtended mode (ignore spaces, comments)
    (?<! \$ )                           # not preceded by $
    \$ (?: \$\$ )?                      # $ or $$$
    \(                                  # opening (
    (?>                                 # non-backtracking atomic group
        (?>                             # non-backtracking atomic group
            [^""'()]+                   # literals, spaces, etc
          | "" (?: [^""\\]+ | \\. )* "" # double quoted string with escapes
          | ' (?: [^'\\]+ | \\. )* '    # single quoted string with escapes
          | (?<open> \( )               # open += 1
          | (?<close-open> \) )         # open -= 1, only if open > 0 (balancing group)
        )*
    )
    (?(open) (?!) )                     # fail if open > 0
    \)                                  # final )
");
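To see how this behaves on text like the examples in the question, here is a small usage sketch. The pattern is repeated so the snippet runs on its own; the variable names and the sample text are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var macroCall = new Regex(@"(?x)
    (?<! \$ )                           # not preceded by $
    \$ (?: \$\$ )?                      # $ or $$$
    \(                                  # opening (
    (?>
        (?>
            [^""'()]+                   # literals, spaces, etc
          | "" (?: [^""\\]+ | \\. )* "" # double quoted string
          | ' (?: [^'\\]+ | \\. )* '    # single quoted string
          | (?<open> \( )               # open += 1
          | (?<close-open> \) )         # open -= 1
        )*
    )
    (?(open) (?!) )                     # fail if open > 0
    \)                                  # final )
");

// Sample text (an assumption): a quoted ')', an escaped "$$(", and a nested call.
string text = "Some text. And a $(MyMacro, \")\") here, $$(NoMatch), $(Outer, $(Inner), Bla).";

var found = macroCall.Matches(text).Cast<Match>().ToList();
foreach (var m in found)
{
    // Each match is one complete macro call, i.e. the span to highlight.
    Console.WriteLine($"{m.Index}: {m.Value}");
}
```

This should report $(MyMacro, ")") and $(Outer, $(Inner), Bla) and skip $$(NoMatch). The nested $(Inner) is part of the outer match and is not reported separately.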
Upvotes: 5
Reputation: 1478
For the cases without a macro as a parameter (1 and 2) you can do:
(?<!\$)+\$\(([^)]*?("[^"]*?")?)+\)
You can see it here
For the case with a macro as a parameter (3) you can do:
(?<!\$)+\$\(([^)]*?("[^"]*?")?(\$\([^)]*?\))?)+\)
But this will not work for macros containing a string with parentheses.
You can see the result here
Upvotes: 1
Reputation: 2852
I was recently looking for a similar regex, but decided that it would be faster to parse the text with C# than with a regex, as my regex skills are bad... so I wrote this method to remove Razor code blocks.
You can easily modify it to match your needs without complex regular expressions.
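The method itself is not shown above. A minimal sketch of the same idea (a hand-written scanner instead of a regex), applied to the macro syntax from the question, might look like the following. FindMacroCalls and FindClosingParen are hypothetical names, not the answerer's code. The sketch assumes that only double quotes with backslash escapes protect parentheses, and that the highlighted span starts at the last $ of an odd run.

```csharp
using System;
using System.Collections.Generic;

string sample = "a $(X, \")\") b $$(No) c $$$(Yes, $(In))";
foreach (var (start, length) in FindMacroCalls(sample))
    Console.WriteLine(sample.Substring(start, length));

// Hypothetical helper: returns (start, length) for every complete macro call.
static List<(int Start, int Length)> FindMacroCalls(string text)
{
    var spans = new List<(int Start, int Length)>();
    int i = 0;
    while (i < text.Length)
    {
        if (text[i] != '$') { i++; continue; }

        // Measure the whole run of '$' characters.
        int runStart = i;
        while (i < text.Length && text[i] == '$') i++;
        int runLength = i - runStart;

        // "$$" is an escaped '$', so only an odd run followed by '(' opens a call.
        if (runLength % 2 == 0 || i >= text.Length || text[i] != '(') continue;

        int callStart = i - 1;                 // the last '$' of the run
        int end = FindClosingParen(text, i);
        if (end < 0) continue;                 // unterminated call: ignore it
        spans.Add((callStart, end - callStart + 1));
        i = end + 1;
    }
    return spans;
}

// Returns the index of the ')' closing the '(' at openIndex, or -1 if there is none.
static int FindClosingParen(string text, int openIndex)
{
    int depth = 0;
    bool inQuotes = false;
    for (int j = openIndex; j < text.Length; j++)
    {
        char c = text[j];
        if (inQuotes)
        {
            if (c == '\\') j++;                // skip the escaped character
            else if (c == '"') inQuotes = false;
        }
        else if (c == '"') inQuotes = true;
        else if (c == '(') depth++;
        else if (c == ')')
        {
            depth--;
            if (depth == 0) return j;
        }
    }
    return -1;
}
```

For the sample string this prints $(X, ")") and $(Yes, $(In)). Whether the leading $$ pairs should be part of the highlighted span is a design choice; the regex answers above include them in the match.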
Upvotes: 0