swshaun

Reputation: 384

Java Regex Matching Groups Bug?

I have written a basic CLI using swing components and am using Regex to recognise commands.

I have happened across something peculiar which I am at a loss to explain. What am I doing wrong here?

This is the code I have:

class GraphCommandFactory {

private final GraphSearchController controller;
private final GraphSearchModel model;
private final ArrayList<Pattern> commands;

public GraphCommandFactory(GraphSearchController controller, GraphSearchModel model) {
    this.model = model;
    this.controller = controller;
    this.commands = new ArrayList<>();

    commands.add(Pattern.compile("SET START ([A-Z]{4,8})"));
}

public Command createCommand(String commandString) {
    Command returnCommand;

    // Test the string against each regex
    int command = 0;
    Matcher matcher = commands.get(command).matcher(commandString);
    ...

private String[] extractArguments(Matcher matcher) {
    String[] arguments = new String[matcher.groupCount()];

    for (int i = 0, j = matcher.groupCount(); i < j; i++) {
        arguments[i] = matcher.group(i);
    }

    return arguments;
}

The problem is in the extractArguments function. With this pattern:

Pattern.compile("SET START ([A-Z]{4,8})");

the last group is lost. However, if I amend it to:

Pattern.compile("SET START ([A-Z]{4,8})()");

Then it correctly captures what I want.

Have I misunderstood how regexes, Pattern and Matcher are to be used? Or is this a bug where the last capturing group is simply lost?

I am using Java SDK 1.8 with NetBeans as my IDE. Stepping through in the debugger leaves me none the wiser.

Upvotes: 2

Views: 203

Answers (1)

anubhava

Reputation: 785146

The problem is in your for loop:

for (int i = 0, j = matcher.groupCount(); i < j; i++) {
    arguments[i] = matcher.group(i);
}

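Your loop stops at matcher.groupCount() - 1, but groups are numbered from 0 to matcher.groupCount() inclusive, with group 0 being the whole match. With one capturing group the loop only ever reads group 0, so group 1 is never copied. This is also why adding an empty () seemed to help: groupCount() became 2, so the loop reached group 1.

Change the loop to the following, and size the array as matcher.groupCount() + 1 so that group 0 also fits (otherwise the last assignment throws ArrayIndexOutOfBoundsException):

String[] arguments = new String[matcher.groupCount() + 1];

for (int i = 0; i <= matcher.groupCount(); i++) {
    arguments[i] = matcher.group(i);
}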

As per Javadoc:

groupCount returns the number of capturing groups in this matcher's pattern. Group zero denotes the entire pattern by convention. It is not included in this count.
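
If you only want the captured arguments and not the whole match, a variant is to start at group 1 and keep the array at groupCount() elements. The sketch below is a minimal, self-contained illustration of that idea. The class name GroupCountDemo and the input "SET START ABCD" are made up for the example and are not from your code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {

    // Copies only the capturing groups (1..groupCount()) into the array,
    // skipping group 0, which always holds the entire match.
    static String[] extractArguments(Matcher matcher) {
        String[] arguments = new String[matcher.groupCount()];
        for (int i = 1; i <= matcher.groupCount(); i++) {
            arguments[i - 1] = matcher.group(i);
        }
        return arguments;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("SET START ([A-Z]{4,8})");
        Matcher matcher = pattern.matcher("SET START ABCD"); // hypothetical input

        // A match must be attempted before group(...) can be called.
        if (matcher.matches()) {
            System.out.println(matcher.groupCount()); // 1
            System.out.println(matcher.group(0));     // SET START ABCD
            System.out.println(matcher.group(1));     // ABCD
            System.out.println(String.join(",", extractArguments(matcher))); // ABCD
        }
    }
}
```

With this version the command's arguments start at index 0 of the returned array, so callers do not need to remember to skip the whole-match entry.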

Upvotes: 7
