AFoeee

Reputation: 751

How to test for word boundaries if the pattern starts or ends with punctuation?

I'm having a hard time testing whether a provided string (that likely starts with !) is surrounded by word boundaries.

// found in Mozilla's RegExp guide.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

let msg = "a b c !test1 d e f";
let cmd = "!test1";

let re = new RegExp("\\b" + escapeRegExp(cmd) + "\\b");

console.log(`re: ${re.test(msg)}`);        // re: false

I assume this behaviour occurs because punctuation itself is counted as a word boundary?

Escaping the punctuation, at least, does not seem to solve the problem. (I've tested a modified version of escapeRegExp() that includes !, with the same result.)

As a workaround, I've used a version that splits msg at whitespace and compares the elements with cmd. I'm not very happy with this solution, as it breaks when cmd itself includes whitespace.
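
For reference, the workaround looks roughly like the sketch below (the function name containsCommand is only for illustration). The last call shows the failure case: a cmd that contains a space can never equal a single whitespace-separated token.

```javascript
// Sketch of the split-based workaround; containsCommand is a hypothetical name.
function containsCommand(msg, cmd) {
  // Split on runs of whitespace and look for an exact token match.
  return msg.split(/\s+/).includes(cmd);
}

console.log(containsCommand("a b c !test1 d e f", "!test1"));   // true
console.log(containsCommand("a b c !test1x d e f", "!test1"));  // false
console.log(containsCommand("a b c !test 1 d e f", "!test 1")); // false, although msg contains cmd
```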

Upvotes: 2

Views: 82

Answers (1)

Wiktor Stribiżew

Reputation: 626920

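\b only matches between a word char and a non-word char. In your msg, the ! is preceded by a space, so there are non-word chars on both sides and \b fails at that position. You can use adaptive dynamic word boundaries instead: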

// found in Mozilla's RegExp guide.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

let msg = "a b c !test1 d e f";
let cmd = "!test1";

let re = new RegExp("(?!\\B\\w)" + escapeRegExp(cmd) + "(?<!\\w\\B)");
// console.log(re.source); // => (?!\B\w)!test1(?<!\w\B)
console.log(`re: ${re.test(msg)}`); // => re: true

The (?!\B\w)!test1(?<!\w\B) regex matches !test1, where:

  • (?!\B\w) - checks if the next char is a word char; if it is, a word boundary is required at the current location, otherwise no word boundary is required
  • (?<!\w\B) - checks if the previous char is a word char; if it is, a word boundary is required at the current location, otherwise no word boundary is required.
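
To see how the two lookarounds behave, here is a small sketch that wraps the pattern in a helper and runs it on a few inputs. The name buildAdaptiveRegExp is made up for this example, and the code assumes a JavaScript engine with lookbehind support (ES2018 or later).

```javascript
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: a word boundary is enforced only on a side
// where cmd starts or ends with a word char.
function buildAdaptiveRegExp(cmd) {
  return new RegExp("(?!\\B\\w)" + escapeRegExp(cmd) + "(?<!\\w\\B)");
}

const re = buildAdaptiveRegExp("!test1");

console.log(re.test("a b c !test1 d e f"));  // true
console.log(re.test("a b c !test12 d e f")); // false: the final "1" is followed by another word char
console.log(re.test("a b c x!test1 d e f")); // true: cmd starts with "!", so no boundary is required on the left

// A cmd that starts with a word char does get a boundary check on the left.
console.log(buildAdaptiveRegExp("test1").test("a xtest1 b")); // false
```

Note the third case: because ! is not a word char, nothing is required before it, so x!test1 still matches. If the command must also be preceded by whitespace or the start of the string, a stricter left-hand check such as (?<!\S) may be a better fit.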

See more details about adaptive dynamic word boundaries in my YouTube video.

Upvotes: 2
