Thomas

Reputation: 113

Optional parts in a JavaScript regular expression (with capture groups)

I have a question about how to make parts of a regular expression optional. As an example I am parsing input for a good old text adventure, which illustrates the task pretty well. Here is what I'm after:

var exp = /^([a-z]+)(?:\s([a-z0-9\s]+)\s(on|with)\s([a-z\s]+))?$/i;

var strings = [
    "look",
    "take key",
    "take the key",
    "put key on table",
    "put the key on the table",
    "open the wooden door with the small rusty key"
];

for (var i = 0; i < strings.length; i++) {
    var match = exp.exec(strings[i]);

    if (match) {
        var verb = match[1];
        var directObject = match[2];
        var preposition = match[3];
        var indirectObject = match[4];

        console.log("String: " + strings[i]);
        console.log("  Verb: " + verb);
        console.log("  Direct object: " + directObject);
        console.log("  Preposition: " + preposition);
        console.log("  Indirect object: " + indirectObject);    
    } else {
        console.log("String is not a match: " + strings[i]);
    }
    console.log(match);
}

My regular expression works for the first string and the last three strings, but not for "take key" or "take the key".

I know how to get the correct result using other methods (like .split()). This is an attempt to learn regular expressions, so I'm not looking for an alternative way to do this :-)

I have tried adding more optional non-capturing groups, but I couldn't get it to work:

var exp = /^([a-z]+)(?:\s([a-z0-9\s]+)(?:\s(on|with)\s([a-z\s]+))?)?$/i;

This works for the first three strings, but not for the last three.
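
To see why the second attempt goes wrong on the last three strings, it may help to look at the captures directly. The following is only a minimal sketch (the names secondAttempt and m are illustrative). The pattern does match, but the class [a-z0-9\s]+ is greedy and also accepts spaces and the letters of "on", so it appears to swallow the rest of the line and leaves nothing for the optional preposition group:

```javascript
// Second attempt from the question, unchanged.
var secondAttempt = /^([a-z]+)(?:\s([a-z0-9\s]+)(?:\s(on|with)\s([a-z\s]+))?)?$/i;

var m = secondAttempt.exec("put key on table");

console.log(m[1]); // "put"
console.log(m[2]); // "key on table" (the whole rest of the line)
console.log(m[3]); // undefined (the optional preposition group was skipped)
console.log(m[4]); // undefined
```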

So what I want is: the first word, then some characters up to a specified word (like "on"), then some characters up to the end of the string.

The tricky part is the different variants.

Can it be done?

WORKING SOLUTION:

exp = /^([a-z]+)(?:\s((?:(?!\s(?:on|with)).)*)(?:\s(on|with)\s(.*))?)?$/i;
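
A short sketch of how this expression could be used on the test strings. The helper name parseCommand and the property names it returns are made up for illustration. The second pattern, withBoundary, is an assumption and not part of the solution above: it adds \b after the preposition inside the lookahead, so that a word that merely starts with "on" or "with" (for example "onion") is not mistaken for a preposition.

```javascript
// Working solution from above, unchanged.
var working = /^([a-z]+)(?:\s((?:(?!\s(?:on|with)).)*)(?:\s(on|with)\s(.*))?)?$/i;

// Assumed variant (not in the original): \b requires the preposition to be a whole word.
var withBoundary = /^([a-z]+)(?:\s((?:(?!\s(?:on|with)\b).)*)(?:\s(on|with)\s(.*))?)?$/i;

// Hypothetical helper: returns null when the input does not match.
function parseCommand(pattern, input) {
    var match = pattern.exec(input);
    if (!match) {
        return null;
    }
    return {
        verb: match[1],
        directObject: match[2],   // undefined for a bare verb such as "look"
        preposition: match[3],    // undefined when there is no "on"/"with" part
        indirectObject: match[4]
    };
}

console.log(parseCommand(working, "take the key"));
// verb "take", directObject "the key", preposition and indirectObject undefined

console.log(parseCommand(working, "open the wooden door with the small rusty key"));
// verb "open", directObject "the wooden door", preposition "with",
// indirectObject "the small rusty key"

console.log(parseCommand(working, "take the onion"));      // null
console.log(parseCommand(withBoundary, "take the onion")); // directObject "the onion"
```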

Upvotes: 4

Views: 9594

Answers (1)

Samuel Caillerie

Reputation: 8275

Perhaps a regex like this:

var exp = /^([a-z]+)(?:(?:(?!\s(?:on|with))(\s[a-z0-9]+))+(?:\s(?:on|with)(\s[a-z0-9]+)+)?)?$/i;

The group \s[a-z0-9]+ captures a word preceded by a space.

The negative lookahead (?!\s(?:on|with)) prevents this word from being "on" or "with".

Thus (?:(?!\s(?:on|with))(\s[a-z0-9]+))+ is the list of words before "on" or "with".
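
One thing to be aware of when trying this pattern: in JavaScript, a capture group inside a repeated group keeps only the text of its last iteration. The captures here therefore hold the last word of each phrase (with its leading space) rather than the whole phrase, and the preposition itself sits in a non-capturing group. A quick check, with illustrative variable names:

```javascript
// Pattern from the answer, unchanged.
var answerExp = /^([a-z]+)(?:(?:(?!\s(?:on|with))(\s[a-z0-9]+))+(?:\s(?:on|with)(\s[a-z0-9]+)+)?)?$/i;

var result = answerExp.exec("put the key on the table");

console.log(result[1]);     // "put"
console.log(result[2]);     // " key"   (last word before the preposition)
console.log(result[3]);     // " table" (last word after the preposition)
console.log(result.length); // 4 (full match plus three groups)
```

If the full phrases are needed, the repeated part would have to be wrapped in an outer capture group instead.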


Upvotes: 2
