Reputation: 8962
I have a string that I want to split into an array:
SEQUENCE: 1A→2B→3C
I tried the following regular expression:
((.*\s)|([\x{2192}]*))
1. \x{2192} is the arrow mark
2. There is a space after the colon; I used it as a reference for matching the first part
It works in regex testers (Patterns on OS X),
but in Java it splits the string into this:
[, , 1, A, , 2, B, , 3, C]
How can I achieve the following?
[1A,2B,3C]
This is the test code:
String str = "SEQUENCE: 1A→2B→3C"; //Note that there's an extra space after the colon
System.out.println(Arrays.toString(str.split("(.*\\s)|([\\x{2192}]*)")));
Upvotes: 0
Views: 942
Reputation: 223083
As noted in Richard Sitze's post, the main problem with the regex is that it should use + rather than *. Additionally, there are further improvements you can make to your regex:
1. Instead of \\x{2192}, use \u2192. And because it's a single character, you don't need to put it into a character class ([...]); you can just use \u2192+ directly.
2. Because | binds more loosely than .*\\s and \u2192+, you won't need the parentheses there either.
So your final expression is simply ".*\\s|\u2192+".
Upvotes: 5
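As a rough check of the simplified pattern ".*\\s|\u2192+" from the answer above, the sketch below runs it against the string from the question. One thing to be aware of: because the first alternative matches the "SEQUENCE: " prefix at the very start of the input, String.split keeps an empty leading element. The class name ArrowSplitSketch and the helper splitItems are hypothetical names introduced only for illustration; stripping the prefix before splitting is one possible way to get exactly [1A, 2B, 3C], not something the answer itself proposes.

```java
import java.util.Arrays;

class ArrowSplitSketch {
    // Splits with the simplified pattern from the answer:
    // either "anything up to the last whitespace" or a run of arrows.
    static String[] splitWithPattern(String input) {
        return input.split(".*\\s|\u2192+");
    }

    // Hypothetical helper (an assumption, not part of the answer): remove
    // everything up to the last whitespace first, then split on arrows only.
    static String[] splitItems(String input) {
        return input.replaceFirst(".*\\s", "").split("\u2192+");
    }

    public static void main(String[] args) {
        String str = "SEQUENCE: 1A\u21922B\u21923C";
        // The prefix match at index 0 leaves an empty first element.
        System.out.println(Arrays.toString(splitWithPattern(str))); // [, 1A, 2B, 3C]
        // With the prefix removed beforehand, only the three items remain.
        System.out.println(Arrays.toString(splitItems(str)));       // [1A, 2B, 3C]
    }
}
```

If the empty first element is acceptable, splitWithPattern alone is enough and the caller can simply skip index 0.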
Reputation: 8463
The \u2192* will match 0 or more arrows, which is why you're splitting on every character (splitting on the empty string). Try changing * to +.
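The difference between the two quantifiers can be seen in a small sketch like the one below. It assumes only the standard String.matches and String.split methods; the class name StarVersusPlusSketch and the helper splitOn are made up for this illustration. The exact array produced by the * variant is left unannotated, since the handling of a zero-width match at the start of the input differs between Java versions.

```java
import java.util.Arrays;

class StarVersusPlusSketch {
    // Thin wrapper so both patterns are applied the same way.
    static String[] splitOn(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String items = "1A\u21922B\u21923C";

        // The star quantifier accepts the empty string; the plus quantifier does not.
        System.out.println("".matches("\u2192*")); // true
        System.out.println("".matches("\u2192+")); // false

        // With *, the empty gaps between characters also count as delimiters,
        // so the items are broken into single characters.
        System.out.println(Arrays.toString(splitOn(items, "\u2192*")));

        // With +, only actual arrows are delimiters.
        System.out.println(Arrays.toString(splitOn(items, "\u2192+"))); // [1A, 2B, 3C]
    }
}
```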
Upvotes: 5