Reputation: 406
I am trying to get a Scanner to split a string on every @ symbol, except when it is escaped (or at the start of a line).
My RegEx:
(?:[^\\])@
(?:   // Start of non-capturing group
[     // Start of character class
^\\   // Any character that is not a backslash
]     // End of character class
)     // End of non-capturing group
@     // Match a literal '@'
From my understanding, this should do what I intend.
However, when I use this pattern in a Scanner, it ignores the fact that the non-capturing group should not count towards the delimiter. The group should only be matched against; the delimiter (the part to be removed/split at) should be just '@'. So for the example string "Hello@World", the result should be ["Hello", "World"].
However, running the code sample below:
    private static void test() {
        try (Scanner sc = new Scanner("test@here")) {
            sc.useDelimiter("(?:[^\\\\])@"); // Every unescaped @ sign.
            while (sc.hasNext()) {
                String token = sc.next();
                System.out.println(token);
            }
        }
    }
yields:
tes
here
instead of the expected:
test
here
Upvotes: 2
Views: 147
Reputation: 406
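Scanner does not take groups into account the way you can with replaceAll (where a captured group can be put back into the result); the entire match is treated as the delimiter.
Instead, you should use a negative lookbehind, so your pattern would look like this: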
(?<!\\)@
This also removes the need for the negated character class.
Here the ?: is simply replaced with ?<!, which turns the non-capturing group into a negative lookbehind group.
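To see why the original pattern eats the last character of "test", it may help to print exactly what each pattern matches. The following is a minimal sketch; the class name DelimiterMatchDemo and the helper matchedText are made up for illustration and are not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterMatchDemo {

    // Hypothetical helper: returns the exact text consumed by each match of regex in input.
    static List<String> matchedText(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group()); // the whole match, groups or not
        }
        return matches;
    }

    public static void main(String[] args) {
        // Non-capturing group: the character before '@' is part of the match.
        System.out.println(matchedText("(?:[^\\\\])@", "test@here")); // [t@]
        // Negative lookbehind: only '@' itself is matched.
        System.out.println(matchedText("(?<!\\\\)@", "test@here"));   // [@]
    }
}
```

Since a Scanner removes the whole delimiter match, the first pattern removes "t@" while the second removes only "@".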
Upvotes: 2
Reputation: 13682
The delimiter is the entire match, without any regard to groups, capturing or non-capturing.
What you need is a lookbehind pattern, and the syntax is easier here with a negative lookbehind.
sc.useDelimiter("(?<!\\\\)@");
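As a rough, self-contained check of the lookbehind delimiter, here is a sketch that collects the tokens into a list. The class name UnescapedAtSplit, the split helper, and the second sample input are assumptions added for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class UnescapedAtSplit {

    // Hypothetical helper: splits input on every '@' that is not preceded by a backslash.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter("(?<!\\\\)@"); // regex: (?<!\\)@
            while (sc.hasNext()) {
                tokens.add(sc.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("test@here")); // [test, here]
        System.out.println(split("a\\@b@c"));   // [a\@b, c]
    }
}
```

Note that the escaping backslash stays in the token; removing it would be a separate step.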
Upvotes: 5