arnsholt

Reputation: 871

Regex that matches before and after certain characters

I am trying to craft a delimiter regex (for use with java.util.Scanner) that splits a string on whitespace while also keeping colons, opening parentheses and closing parentheses as separate tokens. That is, foo(a:b) should be split into the tokens foo, (, a, :, b and ).

My current best effort is the pattern "\\s+|(?=[(:])|(?<=[:)])", which, for some reason I can't understand, fails to match after the opening parenthesis and before the closing parenthesis, but matches fine on both sides of the colon.

Upvotes: 3

Views: 77

Answers (1)

The fourth bird

Reputation: 163217

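Your pattern only splits before ( and :, and after : and ), so nothing separates ( from a, or b from ). If you want all those separate parts, you could extend the character classes: the lookahead (?=[(:)]) asserts one of the characters (, : or ) directly to the right of the position, and the lookbehind (?<=[(:]) asserts ( or : directly to the left. Leaving ) out of the lookbehind is enough if the closing parenthesis ends the string, as in foo(a:b).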

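If you also want to match the position after the last closing parenthesis, for example when more text follows it directly, both character classes can be the same, [(:)], so the lookbehind becomes (?<=[(:)]). The pattern for the first case is: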

\s+|(?=[(:)])|(?<=[(:])
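To see where each pattern actually splits, one option is to list the start index of every delimiter match with java.util.regex.Matcher. This is only a sketch: the class name DelimiterPositions and the helper positions are made up for illustration, and it uses plain Matcher.find() rather than Scanner itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterPositions {
    // Hypothetical helper: collects the start index of every delimiter match.
    static List<Integer> positions(String regex, String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "foo(a:b)";
        // Pattern from the question: no match at index 4 (after '(') or 7 (before ')')
        System.out.println(positions("\\s+|(?=[(:])|(?<=[:)])", s));  // [3, 5, 6, 8]
        // Extended pattern: matches at every boundary inside the string
        System.out.println(positions("\\s+|(?=[(:)])|(?<=[(:])", s)); // [3, 4, 5, 6, 7]
    }
}
```

The indices are positions between characters, so 4 is the gap between ( and a, and 7 is the gap between b and ).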


Example code

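import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        String s = "foo(a:b)";
        // Split on whitespace, before any of ( : ), and after ( or :
        Scanner scanner = new Scanner(s).useDelimiter("\\s+|(?=[(:)])|(?<=[(:])");
        while (scanner.hasNext()) {
            System.out.println(scanner.next());
        }
    }
}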

Output

foo
(
a
:
b
)
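The difference between the two lookbehind classes only shows up when something follows the closing parenthesis directly. A minimal sketch of that case, assuming an input such as foo(a:b)c d (the class name TokenizeDemo and the helper tokens are hypothetical, not part of the original code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class TokenizeDemo {
    // Hypothetical helper: returns every token Scanner yields for a delimiter regex.
    static List<String> tokens(String delimiter, String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter(delimiter)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "foo(a:b)c d";
        // Lookbehind without ')': nothing splits between ')' and 'c'
        System.out.println(tokens("\\s+|(?=[(:)])|(?<=[(:])", input));
        // Same class [(:)] on both sides: ')' becomes its own token
        System.out.println(tokens("\\s+|(?=[(:)])|(?<=[(:)])", input));
    }
}
```

With the first pattern the tokens after b should come out as )c and d; with the second they are ), c and d. For the original input foo(a:b) both patterns give the same six tokens.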

Upvotes: 2
