Asif Ansari

Reputation: 93

Splitting an alphanumeric string using regex is not working

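I want to split a string into words on two conditions: 1) at spaces, and 2) inside any word that starts with 'n', ends with 't', and is followed by digits, so that the letters and the trailing digits become separate items.

Basically, such a word is a payment term.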

The following test case should pass. Can someone explain why the regex is not working?

String[] splitedWords3 = new String[] {"payment","term","nt","40","net","00","net", "30"};
Assertions.assertThat("payment term nt40 net00 net30".split("n[a-z]*t\\d+|\\S+")).isEqualTo(splitedWords3);

Upvotes: 0

Views: 303

Answers (1)

anubhava

Reputation: 785008

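split() is the wrong tool here: it treats the regex as a delimiter and discards whatever the regex matches, while your pattern matches the words themselves. Instead, match what you want and collect the pieces in a List, like this: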

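import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Groups 1 and 2: payment term split into letters and digits; group 3: any other word
final String regex = "\\b(n\\w*t)(\\d+)\\b|(\\S+)";
final String string = "payment term nt40 net00 net30";

final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);

List<String> words = new ArrayList<>();
while (matcher.find()) {
    if (matcher.group(3) != null) {
        // Ordinary word: keep it whole
        words.add(matcher.group(3));
    } else {
        // Payment term: add the letters and the digits as separate items
        words.add(matcher.group(1));
        words.add(matcher.group(2));
    }
}

System.out.println(words);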

Output:

[payment, term, nt, 40, net, 00, net, 30]
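
For comparison, here is a small sketch of what the split() call from the question actually produces. The class and method names (SplitDemo, splitWithQuestionRegex) are made up for this illustration. Every word is consumed as a delimiter, so only the text between the matches is left.

```java
import java.util.Arrays;

class SplitDemo {
    // Applies the regex from the question as a split() delimiter.
    static String[] splitWithQuestionRegex(String input) {
        return input.split("n[a-z]*t\\d+|\\S+");
    }

    public static void main(String[] args) {
        String[] parts = splitWithQuestionRegex("payment term nt40 net00 net30");
        // Each word was matched by the regex and thrown away; what remains is
        // a leading empty string followed by the single spaces between words.
        System.out.println(parts.length);
        System.out.println(Arrays.toString(parts));
    }
}
```

The array holds five elements: one empty string and four single-space strings. That is why the assertion in the question fails.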


RegEx Details:

  • \b: Word boundary
  • (n\w*t): Match n, then 0+ word characters, then t, and capture in group #1. \w* is greedy, so it backtracks until the t is followed only by the digits at the end of the word
  • (\d+): Match 1+ digits at end of word and capture in group #2
  • \b: Word boundary
  • |: OR
  • (\S+): Match 1+ non-whitespace characters and capture in group #3
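
If the group numbers are hard to keep track of, the same pattern can be written with named groups. This is only a sketch: the class name PaymentTermTokenizer, the method tokenize, and the group names prefix, digits and other are my own choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PaymentTermTokenizer {
    // Same regex as above, with named groups instead of #1, #2 and #3.
    private static final Pattern TOKEN =
            Pattern.compile("\\b(?<prefix>n\\w*t)(?<digits>\\d+)\\b|(?<other>\\S+)");

    static List<String> tokenize(String input) {
        List<String> words = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            if (matcher.group("other") != null) {
                // Ordinary word: keep it whole.
                words.add(matcher.group("other"));
            } else {
                // Payment term: letters and digits become separate items.
                words.add(matcher.group("prefix"));
                words.add(matcher.group("digits"));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("payment term nt40 net00 net30"));
    }
}
```

Running main prints the same list as the output above. A word such as "note", which has no trailing digits, falls through to the last alternative and stays in one piece.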

Upvotes: 2
