Reputation: 886
I am looking for a way to match the exact words entered, using a regex.
Unfortunately, a word boundary won't work because the search term can contain multiple words.
I came up with this regex: (?:^|[\\s])(<word>)(?:$|[\\s!?])
and it works perfectly until several occurrences of <word> appear one after another.
Regex: (?:^|[\\s])(won)(?:$|[\\s!?])
Text:
We won won won
In this text, it only matches every second word. I understand this is because the pattern requires a leading space, but that space has already been consumed by the previous match.
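The every-second-word behaviour can be reproduced in a few lines. This is only a sketch: the variable names and the `count` helper are made up for illustration, and the second pattern is one possible tweak (turning the trailing group into a lookahead so the space is not consumed), not necessarily the fix the answers settle on.

```javascript
const text = "We won won won";

// Pattern from the question: the trailing group consumes the space after each match,
// so the next "won" no longer has a leading space available.
const consuming = /(?:^|[\s])(won)(?:$|[\s!?])/g;

// Assumed variant: same pattern, but the trailing check is a lookahead,
// which tests the next character without consuming it.
const lookahead = /(?:^|[\s])(won)(?=$|[\s!?])/g;

// Illustrative helper: number of matches of rx in s.
const count = (rx, s) => [...s.matchAll(rx)].length;

console.log(count(consuming, text)); // 2
console.log(count(lookahead, text)); // 3
```

Like the original, this variant still only accepts whitespace, `!`, `?` or the end of the string after the word, so something like "won," would need the character class extended.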
There are further difficulties with this.
It shouldn't match contractions: won shouldn't match won't. The same applies to hyphenated words such as won-me.
To make this simple I made unit tests for testing all the cases:
https://regex101.com/r/9Mj0UC/4/tests
Note: the unit tests can't check whether the regex matches every occurrence or only every second one, so please simply look at the test string panel.
Can someone provide a solution for this Regex madness?
It needs to be written in Regex (and JS compatible)
Upvotes: 0
Views: 170
Reputation: 13772
What about this way (without lookbehind):
/(?:^|(?!['-])[^]\b)won(?!\B|['-])/i
Upvotes: 1
Reputation: 43169
You could use the following expression:
(\w+-)?won(?![-'])
Additionally, you need to check if the first group is empty programmatically, see a demo on regex101.com.
For engines supporting lookbehinds (Chrome and the like), you could even use
(?<!\w-)won(?![-'])
See a demo on regex101.com as well.
In JS, the first expression can be used like so:
let strings = ["I won't win", "won", "I won", "You won", "We won, finally", "Have we won?", "We won!", "We non-won match", "He won-me"];
let rx = /(\w+-)?won(?![-'])/;
strings.forEach(function(item) {
    // Declare m locally instead of leaking an implicit global
    const m = rx.exec(item);
    // Accept the match only if the optional "prefix-" group did not participate
    if (m !== null && m[1] === undefined)
        console.log(item);
});
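Since the question mentions that the search term may contain several words, here is a rough sketch of how the lookbehind idea could be wrapped for an arbitrary term. `buildTermRegex` is a hypothetical helper, not something from this answer, and the choice to reject any adjacent word character, apostrophe or hyphen on either side is an assumption about what "exact word" should mean. It needs an engine with lookbehind support.

```javascript
// Hypothetical helper: builds a regex that matches `term` only as a standalone word or phrase.
function buildTermRegex(term) {
  // Escape regex metacharacters so the term is matched literally.
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Assumption: a match is rejected when a word character, apostrophe,
  // or hyphen touches it on either side. Lookarounds consume nothing,
  // so consecutive occurrences are all found.
  return new RegExp(`(?<![\\w'-])${escaped}(?![\\w'-])`, "gi");
}

// Illustrative helper: number of matches of `term` in `s`.
function countMatches(term, s) {
  return (s.match(buildTermRegex(term)) || []).length;
}

console.log(countMatches("won", "We won won won"));    // 3
console.log(countMatches("won", "I won't win"));       // 0
console.log(countMatches("won", "We non-won match"));  // 0
console.log(countMatches("won", "He won-me"));         // 0
console.log(countMatches("we won", "We won won won")); // 1
```

Because nothing around the term is consumed, there is no need to inspect a capture group afterwards.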
Upvotes: 1
Reputation: 906
Use a positive lookbehind and a positive lookahead for whitespace. Below is the regex.
// Check that there is whitespace before and after the word
let regex = /(?<=\s)won(?=\s)/g;
console.log("We won won won't won no-won".match(regex));
Upvotes: 0
Reputation: 18619
Simply use \b to match a word boundary:
console.log("We won won won no-won won-with-hyphen".match(/(?<!-)\b(won)\b(?!-)/g))
Upvotes: 0