Reputation: 20038
An input string:
string datar = "aag, afg, agg, arg";
I am trying to get the matches "aag" and "arg", but the following patterns don't work:
string regr = "a[a-z&&[^fg]]g";
string regr = "a[a-z[^fg]]g";
What is the correct way of ignoring regex matches in C#?
Upvotes: 1
Views: 2290
Reputation: 75222
What you're using is Java's set intersection syntax:
a[a-z&&[^fg]]g
...meaning the intersection of the two sets ('a' THROUGH 'z') and (ANYTHING EXCEPT 'f' OR 'g'). No other regex flavor that I know of uses that notation. The .NET flavor uses the simpler set subtraction syntax:
a[a-z-[fg]]g
...that is, the set ('a' THROUGH 'z') minus the set ('f', 'g').
Java demo:
String s = "aag, afg, agg, arg, a%g";
Matcher m = Pattern.compile("a[a-z&&[^fg]]g").matcher(s);
while (m.find())
{
    System.out.println(m.group());
}
C# demo:
string s = @"aag, afg, agg, arg, a%g";
foreach (Match m in Regex.Matches(s, @"a[a-z-[fg]]g"))
{
    Console.WriteLine(m.Value);
}
Output of both is
aag
arg
Upvotes: 3
Reputation: 11808
Try this if you want to match arg and aag:
a[ar]g
If you want to match everything except afg and agg, you need this regex:
a[^fg]g
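To see how the two patterns above differ in practice, here is a minimal sketch. It is written in Java only because an earlier answer already uses Java; both patterns rely on syntax that Java and .NET share. The class name NegatedClassDemo and the helper findAll are hypothetical, and the sample string extends the question's input with a%g and a1g as an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // Hypothetical helper: collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed sample input: the question's string plus two non-letter cases.
        String s = "aag, afg, agg, arg, a%g, a1g";
        // Whitelist: only 'a' or 'r' is allowed in the middle.
        System.out.println(findAll("a[ar]g", s));
        // Blacklist: anything but 'f' or 'g' is allowed, including '%' and '1'.
        System.out.println(findAll("a[^fg]g", s));
    }
}
```

The negated class accepts any character other than f and g, so it also picks up a%g and a1g; if only lowercase letters should qualify, the middle position needs a narrower class.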
Upvotes: 2
Reputation: 49970
The obvious way is to use a[a-eh-z]g, but you could also try a negative lookbehind like this:
string regr = "a[a-z](?<!f|g)g";
Explanation:
a: Match the character "a"
[a-z]: Match a single character in the range between "a" and "z"
(?<!XXX): Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
f|g: Match the character "f" or match the character "g"
g: Match the character "g"
Upvotes: 3
Reputation: 10349
Regex: a[a-eh-z]g.
Then use Regex.Matches to get the matched substrings.
Upvotes: 0
Reputation: 361585
Character classes aren't quite that fancy. The simple solution is:
a[a-eh-z]g
If you really want to explicitly list out the letters that don't belong, you could try something like:
a[^\W\d_A-Zfg]g
This character class matches everything except:
\W excludes non-word characters, i.e. punctuation, whitespace, and other special characters. What's left are letters, digits, and the underscore _.
\d removes digits so now we have letters and the underscore _.
_ removes the underscore so now we only match letters.
A-Z removes uppercase letters so now we only match lowercase letters.
fg removes f and g, which leaves the lowercase letters we actually want.
All in all way more complicated than we'd likely ever want. That's regular expressions for ya!
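As a quick sanity check that the explicit range and the negated class pick out the same strings, here is a minimal sketch. It uses Java only because an earlier answer does; both patterns use syntax that Java and .NET share. The class name ClassEquivalenceDemo, the helper findAll, and the extra test entries (a%g, a1g, a_g, aXg) are assumptions made for illustration. The check covers ASCII input only; how \W treats non-ASCII letters can differ between engines and options.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassEquivalenceDemo {
    // Hypothetical helper: collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed sample: one entry for each kind of character the negated class removes.
        String s = "aag, afg, agg, arg, a%g, a1g, a_g, aXg";
        List<String> simple = findAll("a[a-eh-z]g", s);
        List<String> negated = findAll("a[^\\W\\d_A-Zfg]g", s);
        System.out.println(simple);
        System.out.println(negated);
        // True when both patterns found the same matches in the same order.
        System.out.println(simple.equals(negated));
    }
}
```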
Upvotes: 3
Reputation: 112815
It seems like you're trying to match three lowercase letters that start with a and end with g, with the condition that the second letter cannot be f or g. If this is the case, why not use the following regular expression:
string regr = "a[a-eh-z]g";
Upvotes: 0