Reputation: 3683
I have a string like
String string = "number0 foobar number1 foofoo number2 bar bar bar bar number3 foobar";
I need a regex to give me the following output:
number0 foobar
number1 foofoo
number2 bar bar bar bar
number3 foobar
I have tried
Pattern pattern = Pattern.compile("number\\d+(.*)(number\\d+)?");
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group());
}
but this gives
number0 foobar number1 foofoo number2 bar bar bar bar number3 foobar
Upvotes: 7
Views: 1118
Reputation: 1
Pattern pattern = Pattern.compile("\\w+\\d(\\s\\w+)\\1*");
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group());
}
Upvotes: 0
Reputation: 11958
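That happens because .* is greedy. Use the reluctant .*? instead, and put the next number in a lookahead so that it is not consumed by the current match:
Pattern pattern = Pattern.compile("number\\d+(.*?)(?=number\\d+|$)");
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group().trim());
}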
Upvotes: 0
Reputation: 28074
Why don't you just match number\\d+, query the match locations, and do the String splitting yourself?
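A minimal sketch of that approach, assuming the chunks should be trimmed and that any text before the first marker can be dropped. The class and method names (MarkerSplitter, splitAtMarkers) are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerSplitter {
    // Splits the input into chunks that each begin at a "number<digits>" marker.
    // Assumption: text before the first marker is ignored.
    static List<String> splitAtMarkers(String input) {
        Matcher matcher = Pattern.compile("number\\d+").matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start()); // index where each marker begins
        }
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            // A chunk runs up to the next marker, or to the end of the input.
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : input.length();
            chunks.add(input.substring(starts.get(i), end).trim());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String string = "number0 foobar number1 foofoo number2 bar bar bar bar number3 foobar";
        for (String chunk : splitAtMarkers(string)) {
            System.out.println(chunk);
        }
    }
}
```

The regex stays trivial here; all of the "until the next number" logic lives in the substring arithmetic.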
Upvotes: 0
Reputation: 9340
The (.*) part of your regex is greedy, so it eats everything from that point to the end of the string. Change it to the non-greedy variant: (.*?)
http://docs.oracle.com/javase/tutorial/essential/regex/quant.html
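To see the difference, here is a small sketch that runs three variants against the string from the question. The helper name findAll and the class name QuantifierDemo are hypothetical. Note that a reluctant quantifier on its own matches as little as possible, so it needs something after it to stop at; the lookahead in the third pattern is one way to do that, not the only one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Collects every match of the regex in the input, trimmed of surrounding whitespace.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group().trim());
        }
        return matches;
    }

    public static void main(String[] args) {
        String string = "number0 foobar number1 foofoo number2 bar bar bar bar number3 foobar";
        // Greedy: .* runs to the end of the string, so there is a single match.
        System.out.println(findAll("number\\d+(.*)", string));
        // Reluctant with nothing after it: .*? matches nothing at all.
        System.out.println(findAll("number\\d+(.*?)", string));
        // Reluctant, stopping just before the next marker or at the end of the input.
        System.out.println(findAll("number\\d+(.*?)(?=number\\d+|$)", string));
    }
}
```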
Upvotes: -1
Reputation: 115328
If "foobar" is just an example and you really mean "any word", use the following pattern: (number\\d+)\\s+(\\w+)
Upvotes: 0
Reputation: 336078
So you want number (+ an integer) followed by anything until the next number (or end of string), right?
Then you need to tell that to the regex engine:
Pattern pattern = Pattern.compile("number\\d+(?:(?!number).)*");
In your regex, the .* matched as much as it could - everything until the end of the string. Also, you made the second part, (number\\d+)?, part of the match itself.
Explanation of my solution:
number      # Match "number"
\d+         # Match one or more digits
(?:         # Match...
 (?!        #  (as long as we're not right at the start of the text
  number    #   "number"
 )          #  )
 .          #  any character
)*          # Repeat as needed.
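Put together with the string from the question, this could look like the sketch below. The class and method names (TemperedDotDemo, findChunks) are made up, and the trim() call is an addition: the pattern also matches the space just before the next "number", so each raw match except the last ends in a space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemperedDotDemo {
    // "number" + digits, then any characters that do not start another "number".
    static final Pattern CHUNK = Pattern.compile("number\\d+(?:(?!number).)*");

    static List<String> findChunks(String input) {
        List<String> chunks = new ArrayList<>();
        Matcher matcher = CHUNK.matcher(input);
        while (matcher.find()) {
            chunks.add(matcher.group().trim()); // drop the trailing space
        }
        return chunks;
    }

    public static void main(String[] args) {
        String string = "number0 foobar number1 foofoo number2 bar bar bar bar number3 foobar";
        for (String chunk : findChunks(string)) {
            System.out.println(chunk);
        }
    }
}
```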
Upvotes: 10