Reputation: 61
Requirement:
I need to find a searched word in a sentence, so I am using the RegExp word boundary (\b) for that.
Note: I need to match the WHOLE WORD.
The issue I am facing:
When I use the RegExp word boundary to search for a word in a sentence, it does not take into account the letters that follow a special character. For example, the string below contains only one standalone Greek (the other occurrence is part of Greek's), but the RegExp reports two matches.
"The particularly mysteries, which honored the Greek's goddess Demeter Greek."
Code Snippet:
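// Placeholder class name; the original snippet shows only the class body.
class WordSearch {
  word: string = "Greek";
  sentence: string = "The particularly mysteries, which honored the Greek's goddess Demeter Greek.";
  isWordThere: boolean = false;
  searchedValue: any = [];
  constructor() {
    // \b sees a boundary before the apostrophe in "Greek's", so both occurrences match.
    const regex = new RegExp('\\b' + this.word + '\\b', 'g');
    this.isWordThere = regex.test(this.sentence);
    this.searchedValue = this.sentence.match(regex);
    console.log(this.searchedValue); // ["Greek", "Greek"]
  }
}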
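A minimal standalone reproduction of the behaviour described above, outside of any class (the variable names are only illustrative):

```typescript
// Same sentence as in the question.
const sentence = "The particularly mysteries, which honored the Greek's goddess Demeter Greek.";

// \b only checks that one neighbour is a word character ([A-Za-z0-9_]) and the other is not.
// The apostrophe in "Greek's" is not a word character, so a boundary exists right after "Greek" there.
const boundaryRegex = /\bGreek\b/g;

const boundaryMatches: string[] = sentence.match(boundaryRegex) ?? [];
console.log(boundaryMatches.length); // 2
```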
What changes can I make to match the whole word? Or what else can I do to achieve this requirement?
Upvotes: 1
Views: 137
Reputation: 18611
Use a shorter pattern:
RegExp("(?<![-'\\p{L}0-9])" + this.word + "(?![-'\\p{L}0-9])", 'gu');
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    [-'\p{L}0-9]           any character of: '-', ''', a Unicode letter, digits
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  Greek                    the searched word (the value of this.word)
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    [-'\p{L}0-9]           any character of: '-', ''', a Unicode letter, digits
--------------------------------------------------------------------------------
  )                        end of look-ahead
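
As a rough sketch of how the explanation above could be wired up, the snippet below uses the same character class on both sides of the word. The function names countWholeWord and escapeRegExp are hypothetical and not part of the answer; escaping the search word is an extra precaution added here in case the word contains regex metacharacters.

```typescript
// Hypothetical helper (an assumption, not from the answer):
// escapes regex metacharacters so the search word is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Counts whole-word occurrences using the lookbehind/lookahead pair explained above.
function countWholeWord(word: string, sentence: string): number {
  // Characters that must NOT appear directly before or after the word.
  const excluded = "[-'\\p{L}0-9]";
  const regex = new RegExp(
    "(?<!" + excluded + ")" + escapeRegExp(word) + "(?!" + excluded + ")",
    "gu" // 'u' is required for \p{L}
  );
  return (sentence.match(regex) ?? []).length;
}

const greekCount = countWholeWord(
  "Greek",
  "The particularly mysteries, which honored the Greek's goddess Demeter Greek."
);
console.log(greekCount); // 1
```

Because '-' is in the excluded class, this variant would also skip the Greek in a hyphenated form such as Greek-style.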
Upvotes: 0
Reputation: 61
With a few modifications to @chris-maurer's answer, I was able to achieve my requirement. The RegExp that works for me is shown below.
RegExp("(?<![-'0-9a-zÀ-ÿœēčŭ])" + this.word + "(?![-'0-9a-zÀ-ÿœēčŭ])", 'g');
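A small sketch of that pattern in use, with a hypothetical wrapper function findWithLatinClass. One thing that seems worth checking for your own data: since there is no 'i' flag, a-z covers only lowercase letters, so an uppercase ASCII letter directly next to the word is not excluded. The word is concatenated unescaped, as in the original, which assumes it contains no regex metacharacters.

```typescript
// Characters that must not touch the word on either side (copied from the RegExp above).
const latinClass = "[-'0-9a-zÀ-ÿœēčŭ]";

// Hypothetical wrapper name; returns every whole-word match.
function findWithLatinClass(word: string, sentence: string): string[] {
  const regex = new RegExp("(?<!" + latinClass + ")" + word + "(?!" + latinClass + ")", "g");
  return sentence.match(regex) ?? [];
}

const latinMatches = findWithLatinClass(
  "Greek",
  "The particularly mysteries, which honored the Greek's goddess Demeter Greek."
);
console.log(latinMatches.length); // 1

// Caveat: "S" is uppercase and therefore not in the class, so "GreekS" still counts.
const caveatMatches = findWithLatinClass("Greek", "GreekS Greek");
console.log(caveatMatches.length); // 2
```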
Upvotes: 1
Reputation: 2547
Instead of \b, which is a zero-width assertion that only distinguishes word characters ([A-Za-z0-9_]) from everything else, you will want a general negative lookahead (and you might as well include the same thing in a negative lookbehind):
RegExp("(?<![a-z'])" + this.word + "(?![a-z'])", 'gi')
This assumes your 'words' consist only of letters. I also changed it to ignore case, so it matches either Greek or greek. It will not match Greek's, Greeks, or fenugreek. It will match Greek and the greek in greek-specific. If you change the example to search for "all" in the sentence "Y'all should not take all the cookies.", it won't match the all in Y'all but will match the standalone all.
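The cases listed above can be checked with a short sketch like the following. The function name matchAlphaWord is made up for illustration, and the word is assumed to contain only letters, so it is concatenated without escaping.

```typescript
// Hypothetical helper: whole-word, case-insensitive search where letters
// and apostrophes count as part of a word.
function matchAlphaWord(word: string, sentence: string): string[] {
  const regex = new RegExp("(?<![a-z'])" + word + "(?![a-z'])", "gi");
  return sentence.match(regex) ?? [];
}

// Greek's, Greeks and fenugreek are skipped; Greek and greek-specific are matched.
const greekCases = matchAlphaWord("Greek", "Greek's Greeks fenugreek Greek greek-specific");
console.log(greekCases); // ["Greek", "greek"]

// The "all" inside Y'all is skipped because it is preceded by an apostrophe.
const allCases = matchAlphaWord("all", "Y'all should not take all the cookies.");
console.log(allCases); // ["all"]
```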
Upvotes: 1