Reputation: 547
I was working through a Java regular expression tutorial online and got confused by one small program.
// String to be scanned to find the pattern.
String line = "This order was places for QT3000! OK?";
String pattern = "(.*)(\\d+)(.*)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find()) {
    System.out.println("Found value: " + m.group(0));
    System.out.println("Found value: " + m.group(1));
    System.out.println("Found value: " + m.group(2));
}
And the results printed out are:
Found value: This order was places for QT3000! OK?
Found value: This order was places for QT300
Found value: 0
I have no idea why group(1) gets the value shown above. Why does it stop before the last zero of 'QT3000'?
Thank you very much!
Upvotes: 2
Views: 242
Reputation: 8350
It helps to look at what each group number refers to.
Group 0 is always the entire match.
Group 1 is the match for the first (.*), which is called "greedy" because it matches as many characters as possible while still letting the rest of the pattern match (in your case "This order was places for QT300").
Group 2 is the match for (\d+). It only gets what group 1 leaves behind, which is the minimum needed for the regex to match (in your case "0").
Group 3 (which you did not print) is the last (.*) and matches "! OK?". (The "?" is a special character only inside a regex pattern, not in the input string; to match it literally in a pattern, prefix it with \.)
If you want group 2 to match 3000, make the first group lazy with this regex:
String pattern = "(.*?)(\\d+)(.*)";
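To see the difference side by side, here is a minimal sketch that runs both patterns against the string from the question and prints groups 1 to 3. The class name GreedyVsLazy and the helper method groups are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: returns groups 1..3 of the first match, or null if nothing matches.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was places for QT3000! OK?";

        // Greedy first group: takes as much as it can, leaving one digit for \d+.
        String[] greedy = groups("(.*)(\\d+)(.*)", line);
        // Lazy first group: takes as little as it can, so \d+ gets all the digits.
        String[] lazy = groups("(.*?)(\\d+)(.*)", line);

        System.out.println(String.join(" | ", greedy));
        // prints: This order was places for QT300 | 0 | ! OK?
        System.out.println(String.join(" | ", lazy));
        // prints: This order was places for QT | 3000 | ! OK?
    }
}
```

Note that group 3 is "! OK?" in both cases; only the split between groups 1 and 2 changes.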
Upvotes: 0
Reputation: 72850
The first group of (.*) (this is index 1 - index 0 is the overall regular expression) is a greedy match. It captures as much as it can while letting the overall expression still match. Thus it can take up to the second 0 in the string, leaving just 0 to match (\\d+). If you want different behaviour, then you should read up on greedy and non-greedy matches, or find a more appropriate pattern.
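As one possible reading of "a more appropriate pattern": if the goal is only to pull out the number, the surrounding (.*) groups may not be needed at all, since find() already scans for the first place the pattern matches. The sketch below assumes that goal; the class FirstNumber and the method firstNumber are names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumber {
    // Hypothetical helper: returns the first run of digits in the input, or null if there is none.
    static String firstNumber(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "This order was places for QT3000! OK?";
        System.out.println("Found value: " + firstNumber(line));
        // prints: Found value: 3000
    }
}
```

With no greedy (.*) in front of it, \d+ starts at the first digit and, being greedy itself, consumes the whole run.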
Upvotes: 2