Reputation: 21
Let's say I have a string:
String sentence = "My nieces are Cara:8 Sarah:9 Tara:10";
And I would like to find all their respective names and ages with the following pattern matcher:
String regex = "My\\s+nieces\\s+are((\\s+(\\S+):(\\d+))*)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(sentence);
I understand something like
matcher.find(0); // resets "pointer"
String niece = matcher.group(2);
String nieceName = matcher.group(3);
String nieceAge = matcher.group(4);
would give me my last niece (" Tara:10", "Tara", "10").
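A minimal sketch that confirms this behaviour: when a capturing group sits inside a repetition, Java keeps only the text from its last iteration. The class and method names here are hypothetical and exist only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastRepetitionDemo {
    // Returns groups 2, 3 and 4 after matching from index 0,
    // or an empty array when the sentence does not match at all.
    static String[] lastNiece(String sentence) {
        Pattern pattern = Pattern.compile("My\\s+nieces\\s+are((\\s+(\\S+):(\\d+))*)");
        Matcher matcher = pattern.matcher(sentence);
        if (!matcher.find(0)) {
            return new String[0];
        }
        // Each repeated group holds only what its final iteration captured.
        return new String[] { matcher.group(2), matcher.group(3), matcher.group(4) };
    }

    public static void main(String[] args) {
        String[] last = lastNiece("My nieces are Cara:8 Sarah:9 Tara:10");
        System.out.println(String.join("|", last)); // prints " Tara:10|Tara|10"
    }
}
```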
How would I collect all of my nieces instead of only the last, using only one regex/pattern?
I would like to avoid splitting the string.
Upvotes: 2
Views: 74
Reputation: 18490
Another idea is to use the \G anchor, which matches where the previous match ended (or at the start of the input).
String regex = "(?:\\G(?!\\A)|My\\s+nieces\\s+are)\\s+(\\S+):(\\d+)";
My\s+nieces\s+are matches the literal prefix.
\G will chain matches from there.
(?!\A) is a negative lookahead that prevents \G from matching at the start of the input (\A).
\s+(\S+):(\d+) uses two capturing groups for extraction.
See this demo at regex101 or a Java demo at tio.run
Matcher m = Pattern.compile(regex).matcher(sentence);
while (m.find()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
}
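For a self-contained version of the loop above, here is a sketch that collects the pairs into an insertion-ordered map. The class name NieceCollector and the method collect are hypothetical, and parsing the age with Integer.parseInt assumes the digits fit in an int.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NieceCollector {
    // Either continue right after the previous match (\G, but not at the
    // very start of the input) or begin the chain at the literal prefix.
    private static final Pattern NIECE = Pattern.compile(
            "(?:\\G(?!\\A)|My\\s+nieces\\s+are)\\s+(\\S+):(\\d+)");

    static Map<String, Integer> collect(String sentence) {
        Map<String, Integer> nieces = new LinkedHashMap<>();
        Matcher m = NIECE.matcher(sentence);
        while (m.find()) {
            // group(1) is the name, group(2) the age
            nieces.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return nieces;
    }

    public static void main(String[] args) {
        // prints {Cara=8, Sarah=9, Tara=10}
        System.out.println(collect("My nieces are Cara:8 Sarah:9 Tara:10"));
    }
}
```

Input that lacks the "My nieces are" prefix yields an empty map, because \G alone is not allowed to start a chain at the beginning of the input.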
Upvotes: 2
Reputation: 50716
You can't iterate over repeating groups, but you can match each group individually, calling find() in a loop to get the details of each one. If they need to be back-to-back, you can iteratively bound your matcher to the last index, like this:
Matcher matcher = Pattern.compile("My\\s+nieces\\s+are").matcher(sentence);
if (matcher.find()) {
int boundary = matcher.end();
matcher = Pattern.compile("^\\s+(\\S+):(\\d+)").matcher(sentence);
while (matcher.region(boundary, sentence.length()).find()) {
System.out.println(matcher.group());
System.out.println(matcher.group(1));
System.out.println(matcher.group(2));
boundary = matcher.end();
}
}
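To show what the back-to-back requirement means in practice, here is a sketch of the same region technique wrapped in a method that returns the pairs it finds. The names RegionNieces and collectAdjacent are hypothetical. It relies on the matcher's default anchoring bounds, under which ^ matches at the start of the current region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionNieces {
    private static final Pattern PREFIX = Pattern.compile("My\\s+nieces\\s+are");
    private static final Pattern ENTRY = Pattern.compile("^\\s+(\\S+):(\\d+)");

    static List<String> collectAdjacent(String sentence) {
        List<String> nieces = new ArrayList<>();
        Matcher prefix = PREFIX.matcher(sentence);
        if (!prefix.find()) {
            return nieces;
        }
        int boundary = prefix.end();
        Matcher entry = ENTRY.matcher(sentence);
        // Each pass restricts the matcher to the text after the previous entry,
        // so ^ forces the next entry to start exactly at that boundary.
        while (entry.region(boundary, sentence.length()).find()) {
            nieces.add(entry.group(1) + "=" + entry.group(2));
            boundary = entry.end();
        }
        return nieces;
    }

    public static void main(String[] args) {
        // prints [Cara=8, Sarah=9, Tara=10]
        System.out.println(collectAdjacent("My nieces are Cara:8 Sarah:9 Tara:10"));
        // prints [Cara=8] because "and" breaks the chain
        System.out.println(collectAdjacent("My nieces are Cara:8 and Sarah:9"));
    }
}
```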
Upvotes: 2