Reputation: 115
I'm sanitizing a string by removing strings found in this array:
$regex = array("subida", " de"," do", " da", "em", " na", " no", "blitz");
And this is the str_replace() call that I'm using:
for ($i = 0; $i < 8; $i++) {
    $twit = str_replace($regex[$i], '', $twit);
}
How do I make it remove a word only when it appears as a whole word in the string? For example, I have the following phrase:
#blitz na subida do alfabarra blitz
It returns:
# alfabarra
I don't want the first blitz to be removed because it is preceded by a hash (#). I want it to output:
#blitz alfabarra
Upvotes: 3
Views: 245
Reputation: 47863
To sanitize your Portuguese phrase as desired using your set of words, each word needs to be programmatically prepared for the regex engine.
If the word starts with a space (an entirely insignificant word), there is no need for a leading word boundary; simply escape any special characters and append a word boundary.
If the word does not start with a space, then prefix it with an optional space, a word boundary, and a negative lookbehind that rejects a preceding hash (#).
Then escape special characters and append a word boundary.
Code: (Demo)
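// Sample data taken from the question
$subpatterns = array("subida", " de", " do", " da", "em", " na", " no", "blitz");
$str = '#blitz na subida do alfabarra blitz';

// Build one alternative per word; words without a leading space get
// an optional space, a word boundary, and a "not after #" lookbehind
$subs = array_map(
    fn($v) => (str_starts_with($v, ' ') ? '' : ' ?\b(?<!#)')
        . preg_quote($v) . '\b',
    $subpatterns
);
echo preg_replace(
    '~' . implode('|', $subs) . '~u',
    '',
    $str
);
// #blitz alfabarra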
I have added the u pattern modifier in case multibyte characters come into play. Your sample text doesn't indicate the possibility of uppercase characters, so the pattern is case-sensitive.
Upvotes: 0
Reputation: 11
Try this:
for ($i = 0; $i < count($regex); $i++) {
    // Only attempt the replacement when the entry is a string
    if (is_string($regex[$i])) {
        $twit = str_replace($regex[$i], '', $twit);
    }
}
Upvotes: 0
Reputation: 42458
After failing to come up with a catch-all regex solution, the following may be useful:
$words = array("subida", " de", " do", " da", "em", " na", " no", "blitz");
$words = array_map('trim', $words);
$str = '#blitz *blitz ablitz na subida do alfabarra blitz# blitz blitza';
$str_words = explode(' ', $str);
$str_words = array_diff($str_words, $words);
$str = implode(' ', $str_words);
var_dump($str);
This gets around a few complications with word boundaries in regex-based solutions.
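For reference, the same token-based approach can be wrapped in a small function and run against the phrase from the question. This is only a sketch: `remove_exact_words` is a hypothetical name, and it assumes words are separated by single spaces.

```php
<?php
// Hypothetical helper: drops space-separated tokens that exactly equal a listed word.
function remove_exact_words(string $str, array $words): string
{
    $words = array_map('trim', $words);   // " de" becomes "de"
    $tokens = explode(' ', $str);         // split on single spaces
    // array_diff compares whole strings, so "#blitz" is not equal to "blitz"
    return implode(' ', array_diff($tokens, $words));
}

$words = array("subida", " de", " do", " da", "em", " na", " no", "blitz");
echo remove_exact_words('#blitz na subida do alfabarra blitz', $words);
// #blitz alfabarra
```

Because the comparison is per token, anything glued to a word (a hash, an asterisk, an extra letter) keeps that token from being removed.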
Upvotes: 1
Reputation: 490143
This assumes that none of your strings have / in them. If they do, run preg_quote() explicitly with / as the second argument.
It also assumes you want to match the words, so I trimmed each word.
$words = array("subida", " de"," do", " da", "em", " na", " no", "blitz");
$words = array_map('trim', $words);
$words = array_map('preg_quote', $words);
$str = preg_replace('/\b[^#](?:' . implode('|', $words) . ')\b/', '', $str);
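One thing to watch with the pattern above: `[^#]` consumes the character in front of the word, so a listed word at the very start of the string is never matched. A possible variant, offered as a sketch rather than a drop-in fix, uses a negative lookbehind plus an optional leading space instead. The function name `strip_words` is hypothetical.

```php
<?php
// Hypothetical helper: removes whole words from $str unless directly preceded by '#'.
function strip_words(string $str, array $words): string
{
    // Trim the list entries and escape them for use inside a /.../ pattern
    $quoted = array_map(
        fn($w) => preg_quote(trim($w), '/'),
        $words
    );
    // ' ?' takes one leading space along with the word;
    // (?<!#) rejects a word that directly follows a hash
    $pattern = '/ ?(?<!#)\b(?:' . implode('|', $quoted) . ')\b/';
    return preg_replace($pattern, '', $str);
}

$words = array("subida", " de", " do", " da", "em", " na", " no", "blitz");
echo strip_words('#blitz na subida do alfabarra blitz', $words);
// #blitz alfabarra
```

The lookbehind only inspects the preceding character without consuming it, which is why a word at position 0 can still match.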
Upvotes: 5