D.Q.

Reputation: 547

A Java regular expression for finding a string of digits

I was working through a Java regular expression tutorial online and got confused by one small program.

  // String to be scanned to find the pattern.
  String line = "This order was places for QT3000! OK?";
  String pattern = "(.*)(\\d+)(.*)";

  // Create a Pattern object
  Pattern r = Pattern.compile(pattern);

  // Now create matcher object.
  Matcher m = r.matcher(line);
  if (m.find( )) {
     System.out.println("Found value: " + m.group(0) );
     System.out.println("Found value: " + m.group(1) );
     System.out.println("Found value: " + m.group(2) );
  } 

And the results printed out are:

Found value: This order was places for QT3000! OK?

Found value: This order was places for QT300

Found value: 0

I have no idea why group(1) gets the value shown above. Why does it stop before the last zero of 'QT3000'?

Thank you very much!

Upvotes: 2

Views: 242

Answers (2)

ilomambo

Reputation: 8350

Actually you got the group numbers wrong.

Group 0 is always the entire match (here that is the whole string, because the pattern matches all of it)

Group 1 is the match for the first (.*), which is called "greedy" because it matches as many characters as possible while still letting the rest of the regex match (in your case "This order was places for QT300")

Group 2 is the match for (\d+). Because group 1 has already taken everything it can, only the minimum needed for the regex to match is left (in your case "0")

Group 3 (which you did not print) is the last (.*) and matches the rest of the line, "! OK?" (the "?" here is just a character in the input, so . matches it; "?" is a special regex character only inside the pattern, where you prefix it with \ to match it literally)

If you want group 2 to match 3000, make the first group reluctant (non-greedy) with this regex:

String pattern = "(.*?)(\\d+)(.*)";
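To see the difference side by side, here is a minimal sketch that runs both patterns against the line from the question. The class name `GreedyVsReluctant` and the helper `groups` are made up for illustration; only `Pattern`, `Matcher`, `find`, `group` and `groupCount` come from `java.util.regex`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns groups 0..groupCount() of the first match,
    // or null when the pattern is not found in the input.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "This order was places for QT3000! OK?";

        // Greedy first group: takes as much as it can, leaving one digit for \d+.
        String[] greedy = groups("(.*)(\\d+)(.*)", line);
        System.out.println(greedy[1]); // This order was places for QT300
        System.out.println(greedy[2]); // 0

        // Reluctant first group: takes as little as it can, so \d+ gets all digits.
        String[] reluctant = groups("(.*?)(\\d+)(.*)", line);
        System.out.println(reluctant[1]); // This order was places for QT
        System.out.println(reluctant[2]); // 3000
    }
}
```

The only change between the two calls is the `?` after the first `.*`, which switches that quantifier from greedy to reluctant.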

Upvotes: 0

David M

Reputation: 72850

The first (.*) group (this is index 1; index 0 is the overall match) is greedy. It captures as much as it can while still letting the overall expression match. It can therefore take everything up to and including the second 0 of 3000, leaving just the final 0 for (\d+). If you want different behaviour, read up on greedy and non-greedy (reluctant) quantifiers, or use a more specific pattern.
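As one possible "more appropriate pattern", assuming the goal is simply to pull out the first run of digits, the surrounding (.*) groups can be dropped entirely. This is a sketch, not the tutorial's code; the class name `FirstDigitRun` and the method `firstDigits` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstDigitRun {
    // With no (.*) in front, nothing competes with \d+ for the digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the input, or null if there is none.
    static String firstDigits(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDigits("This order was places for QT3000! OK?")); // 3000
    }
}
```

If the text before the number is needed as well, a prefix group such as (\D*) is another option, since \D cannot consume digits.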

Upvotes: 2
