Reputation: 8173
I have a RegEx pattern:
@"((?(?!\.\d)\D)*)(\d*\.\d+|\d+)*((?(?<=\d).*))"
designed to break a string into three parts. If I have the strings
"asdf1234asdf"
"asdf .1234asdf"
"asdf. .1234asdf"
"asdf 12.34asdf"
"asdf123.4 asdf"
"asdf.1234asdf"
I need:
1. "asdf" 2. "1234" 3. "asdf"
1. "asdf " 2. ".1234" 3. "asdf"
1. "asdf. " 2. ".1234" 3. "asdf"
1. "asdf " 2. "12.34" 3. "asdf"
1. "asdf" 2. "123.4" 3. " asdf"
1. "asdf" 2. ".1234" 3. "asdf"
But the results change depending on the platform I use.
Regex101.com gives me the results I need.
However, I need to change it from
@"((?(?!\.\d)\D)*)(\d*\.\d+|\d+)*((?(?<=\d).*))"
to
@"((?:(?!\.\d)\D)*)(\d*\.\d+|\d+)*((?(?<=\d).*))"
to get it to work in .NET
So why do I need to get rid of the 'if' block? Does .NET
not support 'if' blocks?
Upvotes: 4
Views: 552
Reputation:
Apparently .NET doesn't handle assertion conditionals such as (?(?!\.\d)\D) correctly.
I wouldn't use that type of conditional for anything in .NET.
.NET does, however, handle expression conditionals very well.
All you have to do is wrap the condition, which can be any group of constructs, in its own pair of parentheses inside the conditional group.
Example: (?( expression construct ) .. | ..)
So putting the assertion inside those parentheses works just fine.
As far as I know, .NET is the only engine that supports expression
conditionals.
It's probably just as well that this is the one kind of conditional it does correctly.
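As a minimal sketch of the (?( expression ) yes | no) form described above: the pattern, the sample inputs, and the `ConditionalDemo` / `FirstToken` names are illustrative assumptions, not taken from the question. The condition is a lookahead wrapped in its own parentheses, followed by a "yes" branch and a "no" branch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConditionalDemo
{
    // Hypothetical pattern for illustration only.
    // Condition: ((?=\d))  -> "the next character is a digit"
    // Yes branch: \d+      -> consume the digits
    // No branch:  [a-z]+   -> consume lowercase letters instead
    public static readonly Regex Rx = new Regex(@"^(?((?=\d))\d+|[a-z]+)");

    // Returns the matched token, or null when neither branch matches.
    public static string FirstToken(string input)
    {
        Match m = Rx.Match(input);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        foreach (string s in new[] { "42abc", "abc42", "!x" })
        {
            Console.WriteLine("\"{0}\" -> {1}", s, FirstToken(s) ?? "(no match)");
        }
    }
}
```

Here "42abc" should take the yes branch and "abc42" the no branch, while "!x" matches neither. The same wrapping is what the pattern in this answer applies to the negative lookahead and the lookbehind.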
# @"((?((?!\.\d))\D)*)(\d*\.\d+|\d+)((?((?<=\d)).*))"
 (                         # (1 start)
      (?(
           (?! \. \d )
      )
           \D
      )*
 )                         # (1 end)
 ( \d* \. \d+ | \d+ )      # (2)
 (                         # (3 start)
      (?(
           (?<= \d )
      )
           .*
      )
 )                         # (3 end)
C#:
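using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string[] sAAA = {
            "asdf1234asdf",
            "asdf .1234asdf",
            "asdf. .1234asdf",
            "asdf 12.34asdf",
            "asdf123.4 asdf",
            "asdf.1234asdf",
        };
        // Each assertion is wrapped in its own parentheses inside (?( ... )).
        // Unlike the pattern in the question, group 2 is not quantified with *.
        Regex RxAAA = new Regex(@"((?((?!\.\d))\D)*)(\d*\.\d+|\d+)((?((?<=\d)).*))");
        for (int i = 0; i < sAAA.Length; i++)
        {
            Match _mAAA = RxAAA.Match(sAAA[i]);
            if (_mAAA.Success)
            {
                // Print the three captured parts: prefix, number, remainder.
                Console.WriteLine("1. = \"{0}\", \t2. = \"{1}\", \t3. = \"{2}\"",
                    _mAAA.Groups[1].Value, _mAAA.Groups[2].Value, _mAAA.Groups[3].Value);
            }
        }
    }
}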
Output:
1. = "asdf", 2. = "1234", 3. = "asdf"
1. = "asdf ", 2. = ".1234", 3. = "asdf"
1. = "asdf. ", 2. = ".1234", 3. = "asdf"
1. = "asdf ", 2. = "12.34", 3. = "asdf"
1. = "asdf", 2. = "123.4", 3. = " asdf"
1. = "asdf", 2. = ".1234", 3. = "asdf"
Upvotes: 0
Reputation: 20486
RegEx is more similar to English than it is to C#. It's a language used to define patterns that find matches within strings. Every programming language implements its own regular expression engine, so there are differences between most of them, while the concepts stay mostly the same. Usually, the more complicated the expression, the less likely it is to be cross-platform compatible. That's why everyone asks SO users what programming language they use when a vague RegEx question is asked.
This is why tools like RegEx101 need to have multiple "flavors" for testing an expression thoroughly. You'll also notice the "Quick Reference" content (cheat sheet containing tokens, quantifiers, etc.) changes as you change between engines.
Wikipedia: Comparison of regular expression engines.
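One way to sidestep flavor differences is to avoid conditionals altogether. The sketch below is an assumed alternative, not the pattern from the question: it keeps the non-capturing group with the negative lookahead from the question's .NET version, makes the number group mandatory, and replaces the trailing conditional with an anchored `(.*)`. The `PortableSplit` class and its `Split` method are hypothetical names for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PortableSplit
{
    // Assumed alternative pattern: no conditionals, only a non-capturing
    // group, a negative lookahead and anchors.
    public static readonly Regex Rx =
        new Regex(@"^((?:(?!\.\d)\D)*)(\d*\.\d+|\d+)(.*)$");

    // Returns the three parts, or null when the input contains no number.
    public static string[] Split(string input)
    {
        Match m = Rx.Match(input);
        if (!m.Success) return null;
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }

    public static void Main()
    {
        string[] samples = { "asdf1234asdf", "asdf. .1234asdf", "asdf123.4 asdf" };
        foreach (string s in samples)
        {
            string[] parts = Split(s);
            Console.WriteLine("1. = \"{0}\", 2. = \"{1}\", 3. = \"{2}\"",
                parts[0], parts[1], parts[2]);
        }
    }
}
```

Note the trade-off: unlike the question's pattern, this one does not match at all when the input contains no number, so it only fits if every input is expected to carry one.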
Upvotes: 5