Reputation: 387
I'm currently using the following regex to remove short words (fewer than 4 characters) from a string.
$dirty = "I welcome you to San Diego";
$clean = preg_replace("/\b[^\s]{1,3}\b/", "", $dirty);
So, this would result in "welcome Diego" (plus the spaces left behind by the removed words).
However, I now need to exclude certain words from being replaced, for instance:
$ignore = array("San", "you");
would result in "welcome you San Diego"
Upvotes: 3
Views: 2424
Reputation: 48887
I recommend using a callback (preg_replace_callback) as it allows a more maintainable solution if you have to scale to a large number of words:
echo preg_replace_callback(
    '/\b[^\s]{1,3}\b/',
    create_function(
        '$matches',
        '$ignore = array("San", "you");
        if (in_array($matches[0], $ignore)) {
            return $matches[0];
        } else {
            return \'\';
        }'
    ),
    "I welcome you to San Diego"
);
// output (apart from leftover spaces): welcome you San Diego
If you're using PHP 5.3 or greater, you can use an anonymous function instead of create_function (which was deprecated in PHP 7.2 and removed in PHP 8.0).
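For reference, here is a minimal sketch of the same callback written as an anonymous function. The helper name removeShortWords and the final whitespace cleanup are my own additions for illustration, not part of the answer above; the strict third argument to in_array is also an assumption about what you want.

```php
<?php
// Sketch only: removeShortWords is a hypothetical helper name.
function removeShortWords($text, array $ignore)
{
    return preg_replace_callback(
        '/\b[^\s]{1,3}\b/',
        function ($matches) use ($ignore) {
            // Keep ignored words; replace every other short match with nothing.
            return in_array($matches[0], $ignore, true) ? $matches[0] : '';
        },
        $text
    );
}

$result = removeShortWords("I welcome you to San Diego", array("San", "you"));

// Removed words leave their surrounding spaces behind, so collapse and trim them.
$result = trim(preg_replace('/\s+/', ' ', $result));

echo $result, "\n"; // welcome you San Diego
```

The closure pulls in the ignore list with use ($ignore), so the list can be passed in from outside instead of being hard-coded in a string as with create_function.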
Upvotes: 5
Reputation: 145482
You can embed your ignore list using a (?!..) negative lookahead assertion:
preg_replace("/\b(?!San|you|not)\w{1,3}\b/", "", ...
Also, I would just use \w instead of [^\s] so that it really only matches word characters.
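If the ignore list lives in an array, as in the question, the lookahead can be built from it. The following is a rough sketch under my own assumptions: removeShortWordsExcept is a hypothetical helper name, and the extra \b inside the lookahead is a small addition so that only whole words are protected.

```php
<?php
// Sketch only: removeShortWordsExcept is a hypothetical helper name.
function removeShortWordsExcept($text, array $ignore)
{
    if (count($ignore) === 0) {
        // No exceptions: an empty alternation would block every match,
        // so fall back to the plain short-word pattern.
        return preg_replace('/\b\w{1,3}\b/', '', $text);
    }

    // Escape regex metacharacters (and the "/" delimiter) in each ignored word.
    $quoted = array();
    foreach ($ignore as $word) {
        $quoted[] = preg_quote($word, '/');
    }

    // (?!(?:San|you)\b) refuses to match at the start of an ignored whole word.
    $pattern = '/\b(?!(?:' . implode('|', $quoted) . ')\b)\w{1,3}\b/';

    return preg_replace($pattern, '', $text);
}

$out = removeShortWordsExcept("I welcome you to San Diego", array("San", "you"));
echo trim(preg_replace('/\s+/', ' ', $out)), "\n"; // welcome you San Diego
```

As with the callback approach, the replacement leaves the spaces around removed words in place, which is why the usage line collapses whitespace before printing.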
Upvotes: 9