Reputation: 93
I want to split a string into words on two conditions: 1) on spaces, and 2) inside any word that starts with 'n', ends with 't', and has digits after it (basically a payment term), so that the letters and the trailing digits become separate words.
The following test case should pass. Can someone explain why my regex is not working?
String[] splitedWords3 = new String[] {"payment","term","nt","40","net","00","net", "30"};
Assertions.assertThat("payment term nt40 net00 net30".split("n[a-z]*t\\d+|\\S+")).isEqualTo(splitedWords3);
Upvotes: 0
Views: 303
Reputation: 785008
You cannot do this with split, because split throws away whatever the regex matches. Instead, match what you want and collect the matches in a List, like this:
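import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        final String regex = "\\b(n\\w*t)(\\d+)\\b|(\\S+)";
        final String string = "payment term nt40 net00 net30";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        List<String> words = new ArrayList<>();
        while (matcher.find()) {
            if (matcher.group(3) != null) {
                // Ordinary word: keep it whole
                words.add(matcher.group(3));
            } else {
                // Payment term: add the letters and the digits separately
                words.add(matcher.group(1));
                words.add(matcher.group(2));
            }
        }
        System.out.println(words);
    }
}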
Output:
[payment, term, nt, 40, net, 00, net, 30]
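To see why the split from the question cannot pass the test, here is a minimal sketch that runs the original regex through split and counts what is left. The class name SplitDemo and the method splitWithQuestionRegex are only illustrative names, not anything from the question.

```java
import java.util.Arrays;

public class SplitDemo {
    // Applies the regex from the question as a split delimiter.
    static String[] splitWithQuestionRegex(String input) {
        return input.split("n[a-z]*t\\d+|\\S+");
    }

    public static void main(String[] args) {
        String[] parts = splitWithQuestionRegex("payment term nt40 net00 net30");
        // Every word matches the regex, so every word is treated as a
        // delimiter and dropped. Only the text between matches survives:
        // one empty leading string and the four single spaces.
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```

The regex describes the words you want to keep, while split expects a description of what separates them. That is why matching with Matcher.find() is the better fit here.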
RegEx Details:
\b: Word boundary
(n\w*t): Match a word that starts with n until we get t and capture in group #1
(\d+): Match 1+ digits at end of word and capture in group #2
\b: Word boundary
|: OR
(\S+): Match 1+ non-whitespace characters and capture in group #3
Upvotes: 2