Reputation: 1
I am new to Perl and am looking for some help with filtering a list of keywords. In short, the goal is to compare a hash of strings against the same hash of words/phrases, in order to find the lowest common denominator and clean the list up.
For example, say the list included the following:
bat
bat boy
bat-boy
bat&boy
bat:boy
bat's
bat-boy's
batman & bat boy
It should only match to the following:
bat boy (because of bat)
batman & bat boy (because of bat)
Regex is obviously the way to go, but I am stuck because I can't use \b (word boundary match): some of the words contain non-word characters such as -, ', &, : and so on.
What would be the best way to write the regex? I am checking $keyx against $keyz.
Here is the regex:
if ($keyx =~ m/\Q$keyz\E/)
Any help would be appreciated
Upvotes: 0
Views: 1593
Reputation: 67900
Not quite sure what you're after, but I am guessing you want to match whole words only, no partials, and no words connected with non-letters. A way to accomplish this is to use negative look-around assertions:
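use strict;
use warnings;
use v5.10;

chomp(my $line = <DATA>);    # drop the trailing newline from the last item
for (split /, */, $line) {
    # print the item if "bat" has a space or the string edge on both sides
    say if /(?<![^ ])bat(?![^ ])/;
}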
__DATA__
bat, bat boy, bat-boy, bat&boy, bat:boy, bat's, bat-boy's, batman & bat boy
Output:
bat
bat boy
batman & bat boy
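So we assert that the characters surrounding the keyword are not non-space characters. In other words, each neighbor must be either a space or the start/end of the string, so bat-boy and bat's are rejected while bat boy is accepted.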
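As a rough sketch, the same idea could be applied to the $keyx/$keyz comparison from the question. The helper name contains_keyword and the @keywords array are my own assumptions for illustration, not something from your code. I also used \S (any non-whitespace) instead of [^ ], which is a slight variation that treats tabs and newlines as separators too, and I skip the case where a phrase is compared with itself:

```perl
use strict;
use warnings;

# Hypothetical helper: true when $key occurs in $phrase as a whole
# whitespace-delimited token, and $phrase is not simply $key itself.
sub contains_keyword {
    my ($phrase, $key) = @_;
    return 0 if $phrase eq $key;    # ignore self-matches
    # \Q...\E quotes regex metacharacters in the key; the look-arounds
    # require whitespace or a string edge on each side of it.
    return $phrase =~ /(?<!\S)\Q$key\E(?!\S)/ ? 1 : 0;
}

# Assumed sample data, taken from the list in the question.
my @keywords = ("bat", "bat boy", "bat-boy", "bat&boy", "bat:boy",
                "bat's", "bat-boy's", "batman & bat boy");

for my $keyx (@keywords) {
    for my $keyz (@keywords) {
        print "$keyx (because of $keyz)\n"
            if contains_keyword($keyx, $keyz);
    }
}
```

Note that with this approach "batman & bat boy" is reported for "bat boy" as well as for "bat", since both appear in it as whole tokens. If you only want the shortest match, you would need to filter further.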
Upvotes: 1