remmaks

Reputation: 35

In Java with regular expressions, how to capture numbers from a string with unknown length?

My regular expression looks like this: "[a-zA-Z]+[ \t]*(?:,[ \t]*(\\d+)[ \t]*)*"

I can match the lines with this, but I don't know how to capture the numbers. I think it has something to do with grouping.

For example, from the string "asd , 5 ,2,6 ,8", how can I capture the numbers 5, 2, 6 and 8?

A few more examples:

sdfs6df -> no capture

fdg4dfg, 5 -> capture 5

fhhh3      ,     6,8    , 7 -> capture 6 8 and 7

asdasd1,4,2,7 -> capture 4 2 and 7

I need these numbers to continue my work. Thanks in advance.

Upvotes: 0

Views: 146

Answers (1)

The fourth bird

Reputation: 163217

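You could match the leading word characters once, and then use the \G anchor to continue from the end of the previous match, capturing each consecutive run of digits after a comma. Every match then holds one number in group 1.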

Pattern

(?:\w+|\G(?!^))\h*,\h*([0-9]+)

Explanation

  • (?: Non-capturing group
    • \w+ Match 1+ word characters
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, not at the start of the string
  • ) Close the non-capturing group
  • \h*,\h* Match a comma between optional horizontal whitespace characters
  • ([0-9]+) Capture group 1, match 1+ digits
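
To make the role of \G more concrete, here is a minimal sketch that prints where each match starts and ends for one of the question's inputs. The class name ChainedMatchDemo and the helper matchSpans are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChainedMatchDemo {
    // The pattern from this answer, with backslashes doubled for a Java string literal.
    static final Pattern PATTERN =
            Pattern.compile("(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)");

    // Hypothetical helper: returns the {start, end} offsets of every match.
    static List<int[]> matchSpans(String input) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            spans.add(new int[] {m.start(), m.end()});
        }
        return spans;
    }

    public static void main(String[] args) {
        String input = "asdasd1,4,2,7";
        Matcher m = PATTERN.matcher(input);
        // Print each match with its offsets and the captured number.
        while (m.find()) {
            System.out.printf("match \"%s\" at [%d,%d), group 1 = %s%n",
                    m.group(), m.start(), m.end(), m.group(1));
        }
    }
}
```

For this input the first match covers the leading word plus the first number ("asdasd1,4"), and each later match (",2", then ",7") starts exactly where the previous one ended, which is the chaining that \G provides.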


In Java, with the backslashes doubled inside the string literal:

String regex = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";

Example code

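import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The pattern from above, with backslashes doubled for the string literal
String regex = "(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)";
String string = "sdfs6df -> no capture\n\n"
     + "fdg4dfg, 5 -> capture 5\n\n"
     + "fhhh3      ,     6,8    , 7 -> capture 6 8 and 7\n\n"
     + "asdasd1,4,2,7 -> capture 4 2 and 7";

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);

// Each successful find() holds one number in group 1
while (matcher.find()) {
    System.out.println(matcher.group(1));
}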

Output

5
6
8
7
4
2
7
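
If the numbers are needed per line as values rather than printed, the same pattern could be wrapped in a small helper. This is only a sketch: the class name NumberCapture and the method captureNumbers are hypothetical, and it assumes every number fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberCapture {
    // Same pattern as above; compiled once and reused for every line.
    private static final Pattern PATTERN =
            Pattern.compile("(?:\\w+|\\G(?!^))\\h*,\\h*([0-9]+)");

    // Hypothetical helper: collects group 1 of every match in a single line.
    // Assumption: each number fits in an int (parseInt throws otherwise).
    static List<Integer> captureNumbers(String line) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(line);
        while (matcher.find()) {
            numbers.add(Integer.parseInt(matcher.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // The inputs from the question.
        String[] lines = {
            "asd , 5 ,2,6 ,8",
            "sdfs6df",
            "fdg4dfg, 5",
            "fhhh3      ,     6,8    , 7",
            "asdasd1,4,2,7"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + captureNumbers(line));
        }
    }
}
```

Note that the pattern is not anchored to the start of the line, so this helper does not validate the overall line format. If that matters, one option is to check the line against the pattern from the question with matches() first.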

Upvotes: 1
