mark

Reputation: 62732

Java regular expressions - what is wrong with this code

I am trying to extract the name of a property referenced in a string using the $() construct. For instance, if bb=xo-xo, then "aa$(bb)aa" expands to "aaxo-xoaa".

Here is the code:

public static void main(String[] args) {
  final String PROPERTY_NAME_REGEX = "\\w+(?:\\.\\w+)*";
  final String PROPERTY_REFERENCE_REGEX = "\\$\\((" + PROPERTY_NAME_REGEX + ")\\)";
  Pattern pattern = Pattern.compile(PROPERTY_REFERENCE_REGEX);
  String value = "hhh $(aa.bbcc.dd) @jj $(aakfd) j";
  Matcher matcher = pattern.matcher(value);
  StringBuffer sb = new StringBuffer();
  while (matcher.find()) {
    System.out.println(String.format("\"%s\" at [%d-%d)",
      matcher.group(),
      matcher.start(),
      matcher.end()));
    for (int i = 0; i < matcher.groupCount(); ++i) {
      System.out.println(String.format("group[%d] = %s", i, matcher.group(i)));
    }
  }
}

And it displays:

"$(aa.bbcc.dd)" at [4-17)
group[0] = $(aa.bbcc.dd)
"$(aakfd)" at [22-30)
group[0] = $(aakfd)

But I was hoping to get the following output:

"$(aa.bbcc.dd)" at [4-17)
group[0] = aa.bbcc.dd
"$(aakfd)" at [22-30)
group[0] = aakfd

What am I doing wrong?

Upvotes: 1

Views: 147

Answers (2)

erikxiv

Reputation: 4075

Group 0 is always the entire match, regardless of any specified capturing groups. To top this off, Matcher.groupCount() returns the number of capturing groups, excluding the entire match. To get the result you were after, change your for loop to the following (notice that it starts at 1, and continues one step further due to the added equal sign):

for (int i = 1; i <= matcher.groupCount(); i++) {
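For completeness, here is a minimal, self-contained sketch of the corrected loop applied to the pattern from the question. The class name GroupLoopDemo and the helper capturedGroups are made up for illustration; only the regex and the input string come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class GroupLoopDemo {
  // Same pattern as in the question: group 1 captures the property name.
  static final Pattern REFERENCE =
      Pattern.compile("\\$\\((\\w+(?:\\.\\w+)*)\\)");

  // Collects every capturing group of every match, skipping group 0
  // (group 0 is always the entire match).
  static List<String> capturedGroups(String value) {
    List<String> result = new ArrayList<>();
    Matcher matcher = REFERENCE.matcher(value);
    while (matcher.find()) {
      // Start at 1 and use <= so the last capturing group is included.
      for (int i = 1; i <= matcher.groupCount(); i++) {
        result.add(matcher.group(i));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    for (String name : capturedGroups("hhh $(aa.bbcc.dd) @jj $(aakfd) j")) {
      System.out.println("group[1] = " + name);
    }
  }
}
```

With the input from the question this should print group[1] = aa.bbcc.dd followed by group[1] = aakfd.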

Upvotes: 0

Amber

Reputation: 526563

To answer your specific problem, you should be looking at group[1], not group[0].

The Matcher.groupCount() method does not include group[0] in the count. Your pattern has one capturing group, so groupCount() returns 1 and the loop body runs only for i = 0. It never shows you the group[1] matches because i < matcher.groupCount() is false once i reaches 1.

Change your condition to i <= matcher.groupCount() and your output will be more enlightening.
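Once group 1 gives you the property name, the expansion described in the question could be sketched roughly as below. This is only an illustration under assumptions: the class name ExpandSketch, the expand method, and the use of a Map for property lookup are invented here, and leaving unknown properties untouched is an arbitrary choice.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not taken from the question.
class ExpandSketch {
  // Same pattern as in the question: group 1 is the property name.
  static final Pattern REFERENCE =
      Pattern.compile("\\$\\((\\w+(?:\\.\\w+)*)\\)");

  // Replaces each $(name) with its value from the given map.
  static String expand(String value, Map<String, String> properties) {
    Matcher matcher = REFERENCE.matcher(value);
    StringBuffer sb = new StringBuffer();
    while (matcher.find()) {
      String name = matcher.group(1);
      // Assumption: unknown properties are left exactly as written.
      String replacement = properties.getOrDefault(name, matcher.group());
      // quoteReplacement keeps '$' and '\' in the value from being
      // interpreted as group references by appendReplacement.
      matcher.appendReplacement(sb, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(sb);
    return sb.toString();
  }

  public static void main(String[] args) {
    // Example from the question: bb=xo-xo
    System.out.println(expand("aa$(bb)aa", Map.of("bb", "xo-xo")));
  }
}
```

For the example in the question this should print aaxo-xoaa.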

That said, there are better ways of doing this than writing your own regex - e.g. http://api.dpml.net/ant/1.6.4/org/apache/tools/ant/filters/ExpandProperties.html

Upvotes: 2
