atomicjeep

Reputation: 528

Regexp to ignore hyphenated words when removing common words

I've got this regular expression, which removes common words ($commonWords) from a string ($input), and I would like to tweak it so that it ignores hyphenated words, as these sometimes contain common words.

return preg_replace('/\b('.implode('|',$commonWords).')\b/i','',$input);

thanks

Upvotes: 0

Views: 879

Answers (3)

Kathir

Reputation: 21

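return preg_replace('/(?<![-\'"])\b('.implode('|',$commonWords).')\b(?![-\'"])/i','',$input);

The character classes in the lookarounds make this easy to extend: if more symbols should protect a word from removal, add them inside the brackets (the single quote has to be escaped because the pattern is a single-quoted PHP string).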
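As a rough sketch of how the set of protecting symbols could be made configurable, the function below builds the character class from a string. The name removeUnguardedWords and the $guardChars parameter are hypothetical, and the use of preg_quote() is an added precaution that is not part of the answer above. It assumes $guardChars is non-empty.

```php
<?php
// Hypothetical helper: $guardChars lists the characters that protect a
// neighbouring word from removal. Assumed to be a non-empty string.
function removeUnguardedWords(string $input, array $commonWords, string $guardChars = '-\'"'): string
{
    // preg_quote() escapes regex metacharacters (including "-"), so the
    // characters can be placed inside a character class as literals.
    $class = '[' . preg_quote($guardChars, '/') . ']';

    // Quote each word as well, so it is matched literally.
    $words = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $commonWords));

    // Match a common word only if no guard character touches it on either side.
    $pattern = '/(?<!' . $class . ')\b(' . $words . ')\b(?!' . $class . ')/i';

    return preg_replace($pattern, '', $input);
}

// "the" is removed; the quoted "of" and the "of" in rule-of-thumb are kept.
echo removeUnguardedWords('the "of" rule-of-thumb', ['the', 'of']);
```

Note that the removed word leaves its surrounding whitespace behind, so a separate cleanup step may be wanted.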

Upvotes: 0

Cedric

Reputation: 5303

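preg_replace('/\b(\w+(?:-\w+)+|'.implode('|',$commonWords).')\b/i','',$input);

\w matches any word character (letter, digit, or underscore). This removes all the common words and also every word that contains a hyphen. The hyphenated alternative is listed first so that a hyphenated word is matched as a whole before a common word inside it can match.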

Upvotes: 0

Tim Pietzcker

Reputation: 336208

Try

return preg_replace('/(?<!-)\b('.implode('|',$commonWords).')\b(?!-)/i','',$input);

This adds negative lookaround expressions to the start and end of the regex so that a match is only allowed if there is no dash before or after the match.
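To see the lookarounds in action, here is a minimal sketch. The function name removeCommonWords, the word list, and the sample input are made up for illustration, and wrapping each word in preg_quote() is an extra precaution that is not in the answer above.

```php
<?php
// Hypothetical wrapper around the pattern from the answer.
function removeCommonWords(string $input, array $commonWords): string
{
    // Escape each word so regex metacharacters in it are treated literally.
    $words = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $commonWords));

    // (?<!-) rejects a match preceded by a hyphen;
    // (?!-) rejects a match followed by a hyphen.
    $pattern = '/(?<!-)\b(' . $words . ')\b(?!-)/i';

    return preg_replace($pattern, '', $input);
}

// The standalone "the" is removed; "of" and "the" inside
// "state-of-the-art" are left alone.
echo removeCommonWords('the state-of-the-art', ['the', 'of']);
```

The result keeps the space that followed the removed word, so trimming or collapsing whitespace afterwards may be useful.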

Upvotes: 2
