MRVDOG

Reputation: 1717

Regexp match spaces not followed by a specific word

I have spent the last couple of hours trying to figure out how to match all whitespace (\s) unless followed by AND\s or preceded by \sAND.

I have this so far

\s(?!AND\s)

but it still matches the space after \sAND, which I don't want.

Any help would be appreciated.

Upvotes: 2

Views: 391

Answers (2)

Wiktor Stribiżew

Reputation: 626893

Often, when you want to split on a single character that appears in a specific context, you can replace the splitting approach with a matching one.

I suggest first matching sequences of non-whitespace characters joined by AND with whitespace on both sides, and then matching any remaining non-whitespace sequences. This way we get an array of the required substrings:

\S+\sAND\s\S+|\S+

I assume the \sAND\s pattern appears between some non-whitespace characters.

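// First alternative keeps "X AND Y" together; second matches any other token.
var re = /\S+\sAND\s\S+|\S+/g;
var str = 'split this but don\'t split this AND this';
var res = str.match(re);
// Writes ["split","this","but","don't","split","this AND this"]
document.write(JSON.stringify(res));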

As Alan Moore suggests, the alternation can be unrolled into \S+(?:\sAND\s\S+)*:

  • \S+ - 1 or more non-whitespace characters
  • (?:\sAND\s\S+)* - 0 or more (thus, it is optional) sequences of...
    • \s - one whitespace (add + to match 1 or more)
    • AND - literal AND character sequence
    • \s - one whitespace (add + to match 1 or more)
    • \S+ - one or more non-whitespace symbols.
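A minimal sketch comparing the two patterns above. The input string `sample` is a hypothetical example, not taken from the question. It suggests that the unrolled form also keeps chained `AND` groups together, while the plain alternation only joins one pair at a time:

```javascript
// Patterns from the answer above.
var alternation = /\S+\sAND\s\S+|\S+/g;
var unrolled = /\S+(?:\sAND\s\S+)*/g;

// Hypothetical input with a chained "AND" (an assumption for illustration).
var sample = 'a AND b AND c d';

var viaAlternation = sample.match(alternation);
var viaUnrolled = sample.match(unrolled);

console.log(JSON.stringify(viaAlternation)); // ["a AND b","AND","c","d"]
console.log(JSON.stringify(viaUnrolled));    // ["a AND b AND c","d"]
```

For input with at most one AND per group, such as the string in the question, both patterns return the same array.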

Upvotes: 3

Lucas Trzesniewski

Reputation: 51330

Since JS doesn't support lookbehinds (they were only added later, in ES2018), you can use the following trick, which works in any engine:

  • Match (\sAND\s)|\s
  • Throw away any match where $1 has a value

Here's a short example which replaces the spaces you want with an underscore:

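var str = "split this but don't split this AND this";
str = str.replace(/(\sAND\s)|\s/g, function(m, a) {
	// a is set only when " AND " matched: keep that match unchanged.
	return a ? m : "_";
});
// Writes split_this_but_don't_split_this AND this
document.write(str);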
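For engines that do implement ES2018 lookbehind assertions, the condition from the question can probably be written directly. The following is a sketch under that assumption, not part of the original answer; the variable names `direct` and `text` are made up for illustration:

```javascript
// Assumes an engine with ES2018 lookbehind support (e.g. a recent Node.js).
// Match whitespace that is neither preceded by "\sAND" nor followed by "AND\s".
var direct = /(?<!\sAND)\s(?!AND\s)/g;

var text = "split this but don't split this AND this";

var replaced = text.replace(direct, "_");
var parts = text.split(direct);

console.log(replaced);              // split_this_but_don't_split_this AND this
console.log(JSON.stringify(parts)); // ["split","this","but","don't","split","this AND this"]
```

The capture-group trick above remains the safer choice when older engines have to be supported.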

Upvotes: 0
