Will

Reputation: 2978

java.util.regex.Matcher confused group

I'm having trouble getting the right group of a regex match. My code boils down to the following:

Pattern fileNamePattern = Pattern.compile("\\w+_\\w+_\\w+_(\\w+)_(\\d*_\\d*)\\.xml");
Matcher fileNameMatcher = fileNamePattern.matcher("test_test_test_test_20110101_0000.xml");

System.out.println(fileNameMatcher.groupCount());

if (fileNameMatcher.matches()) {
    for (int i = 0; i < fileNameMatcher.groupCount(); ++i) {
        System.out.println(fileNameMatcher.group(i));
    }
}

I expect the output to be:

2
test
20110101_0000

However, it's:

2
test_test_test_test_20110101_0000.xml
test

Does anyone have an explanation?

Upvotes: 1

Views: 2934

Answers (4)

RobbySherwood

Reputation: 361

Your for loop should include groupCount(), using "<=":

for (int i = 0; i <= fileNameMatcher.groupCount(); ++i) {
    System.out.println(fileNameMatcher.group(i));
}

Your output will then be:

2
test_test_test_test_20110101_0000.xml
test
20110101_0000

groupCount() does not count group 0, which matches the whole string.

The first group is "test", as matched by (\w+), and the second group is "20110101_0000", as matched by (\d*_\d*).
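
If you only want the capture groups and not the whole match, one option is to start the loop at 1 and still include groupCount(). Here is a minimal sketch using the pattern and file name from the question; the class name GroupDemo and the helper captureGroups are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Hypothetical helper: collects groups 1..groupCount(), skipping group 0 (the whole match).
    static List<String> captureGroups(Pattern pattern, String input) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        if (matcher.matches()) {
            // groupCount() excludes group 0, so the last valid index is groupCount() itself.
            for (int i = 1; i <= matcher.groupCount(); ++i) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        Pattern fileNamePattern = Pattern.compile("\\w+_\\w+_\\w+_(\\w+)_(\\d*_\\d*)\\.xml");
        for (String group : captureGroups(fileNamePattern, "test_test_test_test_20110101_0000.xml")) {
            System.out.println(group);
        }
    }
}
```

For the file name in the question this prints "test" followed by "20110101_0000".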

Upvotes: 2

Frank Schmitt

Reputation: 30775

group(0) is the whole match, and group(1), group(2), ... are the subgroups captured by the regular expression.
Why do you expect "test" to be contained in your groups? You didn't define a group to match test (your regex contains only the group \d*_\d*).

Upvotes: 6

NPE

Reputation: 500327

  • group(0) should be the entire match ("test_test_test_test_20110101_0000.xml");
  • group(1) should be the sole capture group in your regex ("20110101_0000").

This is what I am getting. I am puzzled as to why you'd be getting a different value for group(1).

Upvotes: 2

axtavt

Reputation: 242686

Group 0 is the whole match. Capture groups start at 1, so you need this:

System.out.println(fileNameMatcher.group(i + 1)); 
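
Dropped into the loop from the question, with its bounds left unchanged, that would look roughly like the sketch below. The class name OffsetGroupDemo and the method printGroups are made up for illustration, and the output is collected in a string so it is easy to check:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetGroupDemo {
    // Hypothetical helper: the question's loop with group(i + 1) in place of group(i).
    static String printGroups(String fileName) {
        Pattern fileNamePattern = Pattern.compile("\\w+_\\w+_\\w+_(\\w+)_(\\d*_\\d*)\\.xml");
        Matcher fileNameMatcher = fileNamePattern.matcher(fileName);
        StringBuilder out = new StringBuilder();
        if (fileNameMatcher.matches()) {
            // i runs over 0..groupCount()-1, so i + 1 covers groups 1..groupCount().
            for (int i = 0; i < fileNameMatcher.groupCount(); ++i) {
                out.append(fileNameMatcher.group(i + 1)).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(printGroups("test_test_test_test_20110101_0000.xml"));
    }
}
```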

Upvotes: 2
