JavaScript regexes - Lookbehind and lookahead at the same time

I am trying to create a regex in JavaScript that matches the character b only if it is neither preceded nor followed by the character a.

Apparently, JavaScript regexes don't support negative lookbehind, which makes the task difficult. I came up with the following, but it does not work:

"ddabdd".replace(new RegExp('(?:(?![a]b(?![a])))*b(?![a])', 'i'),"c");

Here, the b should not match because an a precedes it, but it matches anyway.

Here are some examples of what I want to achieve:

"ddbdd" matches the b
"b" matches the b
"ddb" matches the b
"bdd" matches the b
"ddabdd" or "ddbadd" does not match the b

Upvotes: 0

Answers (3)

user663031

Reputation:

What you are really trying to do here is write a parser for a tiny language. Regexps are good at some parsing tasks but bad at many others, and JS regexps are somewhat underpowered. You may find a regexp that works in a particular situation, but when your syntax rules change, it may be difficult or impossible to adapt. The simple program below has the advantage of being readable and maintainable: it returns the index of every b that is neither preceded nor followed by a.

function find_bs(str) {
    var indexes = [];
    for (var i = 0; i < str.length; i++) {
        if (str[i] === 'b' && str[i-1] !== 'a' && str[i+1] !== 'a')
            indexes.push(i);
    }
    return indexes;
}
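As a quick check, the loop above can be run against the examples from the question. The sketch below repeats the function so that it runs on its own; the `cases` array is only an illustration built from the question's examples and is not part of the original answer.

```javascript
// Same loop as above, repeated so this snippet is self-contained.
function find_bs(str) {
    var indexes = [];
    for (var i = 0; i < str.length; i++) {
        // str[-1] and str[str.length] are undefined, so the string
        // boundaries never compare equal to 'a'.
        if (str[i] === 'b' && str[i - 1] !== 'a' && str[i + 1] !== 'a')
            indexes.push(i);
    }
    return indexes;
}

// Illustrative inputs taken from the question.
var cases = ['ddbdd', 'b', 'ddb', 'bdd', 'ddabdd', 'ddbadd'];
cases.forEach(function (input) {
    console.log(JSON.stringify(input), '->', JSON.stringify(find_bs(input)));
});
// "ddbdd" -> [2]
// "b" -> [0]
// "ddb" -> [2]
// "bdd" -> [0]
// "ddabdd" -> []
// "ddbadd" -> []
```

Because out-of-range indexes yield undefined, the beginning and end of the string need no special handling in this version.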

Using a regexp

If you absolutely insist on using a regexp, you can use the trick of resetting the lastIndex property on the regexp in conjunction with RegExp.exec:

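function find_bs(str) {
    var indexes = [];
    // Require a non-"a" character on both sides of the b.
    var regexp = /[^a]b[^a]/g;
    var matches;

    while ((matches = regexp.exec(str)) !== null) {
        // The b sits one character after the start of the match.
        indexes.push(matches.index + 1);
        // Step back to the b so it can open the next match.
        regexp.lastIndex -= 2;
    }

    return indexes;
}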

You will need to tweak the logic to handle the beginning and end of the string.

How this works

We find the entire xbx string using the regexp. The index of b will be one plus the index of the match, so we record this. Before we do the next match, we reset lastIndex, which governs the starting point from which the search will continue, back to the b, so it serves as the first character of any following potential match.
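One possible way to handle the beginning and end of the string is to pad the input with a sentinel character that is neither a nor b, so that every b has a neighbor on both sides. This is only a sketch under assumptions: the name `findBsPadded` and the choice of a newline as the sentinel are hypothetical, and it uses the pattern `[^a]b[^a]`, which requires a non-a character on both sides of the b.

```javascript
// Hypothetical helper: pads the input so that a b at either end of the
// string still has a character on both sides for the pattern to match.
function findBsPadded(str) {
    var indexes = [];
    var padded = '\n' + str + '\n'; // sentinel is an assumption; any non-a, non-b character works
    var regexp = /[^a]b[^a]/g;
    var match;

    while ((match = regexp.exec(padded)) !== null) {
        // In the padded string the b is at match.index + 1; removing the
        // one-character padding gives match.index in the original string.
        indexes.push(match.index);
        // Resume at the b so it can be the left neighbor of the next match.
        regexp.lastIndex = match.index + 1;
    }

    return indexes;
}

console.log(findBsPadded('b'));        // [ 0 ]
console.log(findBsPadded('ddbdd'));    // [ 2 ]
console.log(findBsPadded('sdfbbfds')); // [ 3, 4 ]
console.log(findBsPadded('ddabdd'));   // []
```

Setting lastIndex explicitly to the position of the b has the same effect as subtracting 2 after a three-character match.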

Upvotes: 0

nhahtdh

Reputation: 56809

There is no way to emulate the behavior of look-behind with regex alone in this case, since there may be consecutive b characters in the string. Handling them requires the zero-width property of a look-behind, which checks the immediately preceding character without consuming it.

Since the condition in the look-behind is quite simple, you can check for it in the replacement function:

inputString.replace(/b(?!a)/gi, function ($0, idx, str) {
    if (idx === 0 || !/a/i.test(str[idx - 1])) { // Equivalent to (?<!a)
        return 'c';
    } else {
        return $0; // $0 is the text matched by /b(?!a)/
    }
});
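For comparison, engines that implement the lookbehind assertions added in ES2018 can express the same check directly in the pattern. The sketch below assumes such an engine (older browsers will throw a syntax error on the pattern), and the function name `replaceIsolatedB` is hypothetical.

```javascript
// Assumes ES2018 lookbehind support; replaceIsolatedB is an illustrative name.
function replaceIsolatedB(inputString) {
    // (?<!a) rejects a b preceded by a, (?!a) rejects a b followed by a.
    // Neither assertion consumes characters, so consecutive b's are handled.
    return inputString.replace(/(?<!a)b(?!a)/gi, 'c');
}

console.log(replaceIsolatedB('ddbdd'));    // "ddcdd"
console.log(replaceIsolatedB('ddabdd'));   // "ddabdd"
console.log(replaceIsolatedB('ddbadd'));   // "ddbadd"
console.log(replaceIsolatedB('sdfbbfds')); // "sdfccfds"
```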

Upvotes: 1

hwnd

Reputation: 70722

It seems you could use a capturing group containing either the beginning-of-string anchor or a negated character class before "b", together with a negative lookahead asserting that "a" does not follow. You then reference $1 in the replacement string to put the captured character back, followed by the rest of your replacement.

var s = 'ddbdd b ddb bdd ddabdd ddabdd ddbadd';
var r = s.replace(/(^|[^a])b(?!a)/gi, '$1c');
console.log(r); //=> "ddcdd c ddc cdd ddabdd ddabdd ddbadd"

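Edit: As @nhahtdh pointed out regarding consecutive characters, the pattern above misses the second of two adjacent b characters, because the first match consumes the character the second one would need to inspect. You may consider a callback instead.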

var s = 'ddbdd b ddb bdd ddabdd ddabdd ddbadd sdfbbfds';
var r = s.replace(/(a)?b(?!a)/gi, function($0, $1) {
    return $1 ? $0 : 'c';
});
console.log(r); //=> "ddcdd c ddc cdd ddabdd ddabdd ddbadd sdfccfds"
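To make the difference between the two patterns concrete, the short sketch below runs both against an input with two adjacent b characters. The variable names are illustrative only.

```javascript
var input = 'sdfbbfds';

// Capturing-group version: the match "fb" consumes the first b, so the
// second b has no unconsumed character left to serve as its (^|[^a]) prefix.
var withGroup = input.replace(/(^|[^a])b(?!a)/gi, '$1c');

// Callback version: the optional (a) group only consumes a preceding a,
// so each b is examined independently.
var withCallback = input.replace(/(a)?b(?!a)/gi, function ($0, $1) {
    return $1 ? $0 : 'c';
});

console.log(withGroup);    // "sdfcbfds"
console.log(withCallback); // "sdfccfds"
```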

Upvotes: 4
