Reputation: 391
Objective: Regex Matching
For this example I'm interested in matching a "|" pipe character. I need to match it if it's alone, as in "aaa|aaa". In a run of pipes, I need to match the last pipe only if it's preceded by pairs of pipes (2, 4, 6, 8... any even number).
Put another way: I want to ignore ALL pipe pairs "||" (right to left), or equivalently, I want to select bachelor bars only (the odd man out).
string twomatches = "aaaaaaaaa|||||aaaaaa|||aaaaaa"; // match the last pipe of each run (5 pipes, then 3)
string onematch = "aaaaaaaaa|||aaaaaaa||aaaaaaaa";   // match the last pipe of the 3-pipe run only
string noMatch1 = "||";
string noMatch2 = "||||";
I'm trying to select the last "|" only when it is preceded by an even number of "|" characters (a sequence of "||" pairs), or when a single bar exists by itself, regardless of how many "|" the string contains.
Upvotes: 2
Views: 271
Reputation: 18515
Oh, it's reopened! If you need better performance, also try this negative improved version.
\|(?!\|)(?<!(?:[^|]|^)(?:\|\|)*)
The idea here is to first match the last literal | at the right side of a sequence (or a single |) and then execute a negated version of the lookbehind just after the match. This should perform considerably better.
\|(?!\|) - matches a literal | IF NOT followed by another pipe character (the rightmost one in a sequence).
(?<!(?:[^|]|^)(?:\|\|)*) - IF the position right after the matched | IS NOT preceded by (?:\|\|)*, any amount of literal ||, reaching back to a non-| or the ^ start. In other words, the run of pipes ending at the matched | must not have even length.
Btw, there is no performance gain in using \|{2} over \|\|, though it might be more readable.
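To see the pattern above in action, here is a minimal C# sketch that runs it against inputs shaped like the ones in the question. The class name OddPipeDemo and the helper MatchIndexes are made up for illustration, and the sample strings assume the ** markers in the question are bold formatting rather than literal asterisks.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OddPipeDemo
{
    // Pattern from this answer: the last pipe of a run, provided the run
    // (counted up to and including that pipe) does not have even length.
    private static readonly Regex OddPipe =
        new Regex(@"\|(?!\|)(?<!(?:[^|]|^)(?:\|\|)*)");

    // Hypothetical helper: zero-based index of every matched pipe.
    public static int[] MatchIndexes(string input) =>
        OddPipe.Matches(input).Cast<Match>().Select(m => m.Index).ToArray();

    public static void Main()
    {
        string[] samples =
        {
            "aaaaaaaaa|||||aaaaaa|||aaaaaa", // two odd runs
            "aaaaaaaaa|||aaaaaaa||aaaaaaaa", // one odd run, one even run
            "||",                            // even run only
            "||||",                          // even run only
        };

        foreach (var s in samples)
            Console.WriteLine($"{s} -> [{string.Join(", ", MatchIndexes(s))}]");
    }
}
```

Each reported index should point at the final pipe of an odd-length run, and the two even-only strings should produce an empty list.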
Upvotes: 0
Reputation: 627077
You may use the following regex to select just the odd pipe out:
(?<=(?<!\|)(?:\|{2})*)\|(?!\|)
See regex demo.
The regex breakdown:
(?<=(?<!\|)(?:\|{2})*) - if a pipe is preceded by an even number of pipes ((?:\|{2})* - 0 or more sequences of exactly 2 pipes), counted from a position that has no preceding pipe ((?<!\|))
\| - match an odd pipe on the right
(?!\|) - if it is not followed by another pipe.
Please note that this regex uses a variable-width look-behind and is very resource-consuming. I'd rather use a capturing group mechanism here, but it all depends on the actual purpose of matching that odd pipe.
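As a rough illustration of the capturing-group idea for locating (rather than removing) the odd pipe, the sketch below consumes the even-length prefix as part of the match and captures only the final pipe. The class name OddPipeByGroup, the method OddPipeIndexes, and the group name odd are assumptions made for this example, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OddPipeByGroup
{
    // The even-length prefix is matched directly instead of being checked
    // inside a lookbehind; only the final pipe lands in the "odd" group.
    private static readonly Regex Pattern =
        new Regex(@"(?<!\|)(?:\|{2})*(?<odd>\|)(?!\|)");

    // Hypothetical helper: zero-based index of each odd pipe.
    public static List<int> OddPipeIndexes(string input)
    {
        var indexes = new List<int>();
        foreach (Match m in Pattern.Matches(input))
            indexes.Add(m.Groups["odd"].Index); // m.Index would be the start of the whole run
        return indexes;
    }

    public static void Main()
    {
        // Runs of 1, 2, 3 and 4 pipes: only the 1- and 3-pipe runs are odd.
        Console.WriteLine(string.Join(", ", OddPipeIndexes("1|2||3|||4||||"))); // prints: 1, 8
    }
}
```

Note that the overall match now spans the whole run of pipes, so the position of the odd pipe has to be read from the group rather than from the match itself.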
Here is a modified version of the regex for removing the odd one out:
var s = "1|2||3|||4||||5|||||6||||||7|||||||";
var data = Regex.Replace(s, @"(?<!\|)(?<even_pipes>(?:\|{2})*)\|(?!\|)", "${even_pipes}");
Console.WriteLine(data);
See IDEONE demo. Here, the quantified part is moved from the lookbehind to an even_pipes named capturing group, so that it can be restored with the backreference in the replacement string. Regexhero.net shows 129,046 iterations per second for the version with a capturing group and 69,206 for the original version with the variable-width lookbehind.
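For anyone who wants to compare the two removal variants locally, a simple Stopwatch loop along the following lines may be enough for a first impression. This is only a sketch: the class OddPipeTiming, its helpers, and the iteration count are assumptions for illustration, and the timings will depend on the machine, the runtime, and the input, so they need not reproduce the Regexhero figures.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class OddPipeTiming
{
    // Removal by matching only the odd pipe via a variable-width lookbehind.
    private static readonly Regex WithLookbehind =
        new Regex(@"(?<=(?<!\|)(?:\|{2})*)\|(?!\|)", RegexOptions.Compiled);

    // Removal by matching the whole run and restoring the even part.
    private static readonly Regex WithGroup =
        new Regex(@"(?<!\|)(?<even_pipes>(?:\|{2})*)\|(?!\|)", RegexOptions.Compiled);

    public static string RemoveWithLookbehind(string s) => WithLookbehind.Replace(s, "");
    public static string RemoveWithGroup(string s) => WithGroup.Replace(s, "${even_pipes}");

    // Runs the action the given number of times and returns elapsed milliseconds.
    public static long Time(Action action, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const string s = "1|2||3|||4||||5|||||6||||||7|||||||";
        const int n = 100000; // arbitrary iteration count

        // Sanity check first: both variants should produce the same string.
        Console.WriteLine(RemoveWithLookbehind(s) == RemoveWithGroup(s));
        Console.WriteLine($"lookbehind: {Time(() => RemoveWithLookbehind(s), n)} ms");
        Console.WriteLine($"group:      {Time(() => RemoveWithGroup(s), n)} ms");
    }
}
```

A dedicated benchmarking tool would give more trustworthy numbers than a bare loop; the loop is here only to keep the example self-contained.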
Only use variable-width look-behind if it is absolutely necessary!
Upvotes: 1