Matheus Simon

Reputation: 696

.NET Regex not matching properly

I'm trying to match quoted strings, including ones that contain an escaped quote or use the verbatim @ prefix, such as:

"message\""

@"message"

with

@(["'])[\S\s]*?\1|(["'])(?:\\\2|(?!\\\2)(?!\2).)*\2

but for

"message: \"" + message + "\"

the built-in Regex in .NET matches only "message: \" instead of "message: \"", which is what online matchers report for the same pattern, for example:

https://regexr.com/4173n

Does anyone know how to make it work properly?

.NET Code:

string pattern = "([\"'])[\\S\\s]*?\\1|([\"'])(?:\\\\\\2|(?!\\\\\\2)(?!\\2).)*\\2";
string test = "\"message: \\\"\" + message + \"\\\".\n";
MatchCollection matches = Regex.Matches(test, pattern);

Upvotes: 3

Views: 105

Answers (2)

Wiktor Stribiżew

Reputation: 626738

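You left out the @ at the start of the pattern. Without it, the first alternative matches any quoted span lazily and stops at the escaped quote. Also note that a literal backslash in the regex is written \\, which takes 4 backslashes in a regular string literal.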

The regex itself, before any C# string escaping, looks like this:

@(["'])[\S\s]*?\1|(["'])(?:\\\2|(?!\\\2)(?!\2).)*\2

If you want to use a regular string literal:

string pattern = "@([\"'])[\\S\\s]*?\\1|([\"'])(?:\\\\\\2|(?!\\\\\\2)(?!\\2).)*\\2";

Or a verbatim string literal where you only need to escape a " with another ":

string pattern = @"@([""'])[\S\s]*?\1|([""'])(?:\\\2|(?!\\\2)(?!\2).)*\2";
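To see the difference the leading @ makes, here is a minimal sketch that runs both patterns against the test string from the question. The class name QuotedStringDemo and the helper FirstMatch are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedStringDemo
{
    // Pattern as used in the question's code: the leading @ is missing,
    // so the first alternative matches any quoted span lazily.
    public const string WithoutAt = @"([""'])[\S\s]*?\1|([""'])(?:\\\2|(?!\\\2)(?!\2).)*\2";

    // Corrected pattern: the first alternative only applies to @"..." strings.
    public const string WithAt = @"@([""'])[\S\s]*?\1|([""'])(?:\\\2|(?!\\\2)(?!\2).)*\2";

    // Hypothetical helper: returns the first match, or null if there is none.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Test string from the question: "message: \"" + message + "\".
        string test = "\"message: \\\"\" + message + \"\\\".\n";

        Console.WriteLine(FirstMatch(WithoutAt, test)); // "message: \"
        Console.WriteLine(FirstMatch(WithAt, test));    // "message: \""
    }
}
```

With the @ in place, the first alternative cannot start at a bare quote, so the engine falls through to the second alternative, which treats \" as an escaped quote rather than as the end of the string.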

Upvotes: 1

Poul Bak

Reputation: 10929

You need this Regex instead:

@"^(?<quote>(?<![\\])['""])((.(?!(?<![\\])\k<quote>))*.?)\k<quote>"

It does what you want: it matches the quotes and everything between them.

It's actually not my regex, but it works in your case.

It works by storing the opening quote character (either a single or a double quote) in a named capturing group, then looking for the next occurrence of that same character, ignoring any quote that is preceded by a backslash.

Edit: If you don't like @-quoted strings, here's the normal string (escaped):

string pattern = "^(?<quote>(?<![\\\\])['\"])((.(?!(?<![\\\\])\\k<quote>))*.?)\\k<quote>";
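As a rough illustration of how the named group behaves, the sketch below applies the verbatim form of the pattern to the test string from the question. The class name NamedQuoteDemo and the helper MatchLeading are hypothetical and exist only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedQuoteDemo
{
    // Verbatim literal: only " is doubled; backslashes reach the regex engine unchanged.
    public const string Pattern =
        @"^(?<quote>(?<![\\])['""])((.(?!(?<![\\])\k<quote>))*.?)\k<quote>";

    // Hypothetical helper: tries to match a quoted string at the start of the input.
    public static Match MatchLeading(string input)
    {
        return Regex.Match(input, Pattern);
    }

    public static void Main()
    {
        // Test string from the question: "message: \"" + message + "\".
        string test = "\"message: \\\"\" + message + \"\\\".\n";

        Match m = MatchLeading(test);
        Console.WriteLine(m.Groups["quote"].Value); // "
        Console.WriteLine(m.Value);                 // "message: \""
    }
}
```

Note that the leading ^ anchors the pattern, so it only finds a quoted string at the very start of the input. To find quoted strings elsewhere in the text, the anchor would have to be removed.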

Upvotes: 0
