Reputation: 29
I'm learning Drupal/PHP and have an issue I hope someone can help with. I've read similar posts, but the solutions there did not work for me, hence the new question.
I'm trying to adapt an existing Drupal module (wordfilter for D7, dev release) so that it replaces any instance of profanity with an alternative. For example, if a string contains 'word', which is to be replaced, I need to match the whole word containing it, not just the offending characters, so
'wording worder got worded. word!'
needs to become
'<deleted> <deleted> got <deleted>. <deleted>!'
and not
'<deleted>ing <deleted>er got <deleted>ed. <deleted>!'.
The code I have so far has a couple of issues. Firstly, it only replaces the exact match, not the entire word. Secondly, I have an issue with delimiters and escape characters. I've marked where I think the issues are with **issue 1 and **issue 2. If I'm wrong, please let me know.
The error thrown by issue 2 is
Warning: preg_replace(): Unknown modifier '$'
which I think is to do with certain characters not being escaped correctly. I tried wrapping the $pattern variable in delimiters so it read
$text = preg_replace('/' . $pattern . '/', "\${1}" . $replacement . "\${2}", $text);
but no luck. The regex did not then match anything. The issue may be with the regex itself, but I'm pretty sure it's correct. The pattern I'm using is
$pattern = '^(.*?(\B'word'\B)[^$]*)$';
with 'word' being wrapped in a preg_quote call.
So there you go. There's potentially a whole raft of issues for you all to rip to shreds. I'm sure you can all smell blood :-) If I need to rewrite the whole function, so be it. If it's a quick fix, so much the better. If I've missed anything out, or you want more info, let me know and I'll edit the question to contain it. Any help would be greatly appreciated, like I say, I took this approach as a learning exercise so all (constructive) criticism is welcomed.
/**
 * hook_filter process operation callback.
 */
function wordfilter_filter_process($text) {
  //dpm($text);
  $text = ' ' . $text . ' ';
  $list = _wordfilter_list();
  $utf8 = variable_get('wordfilter_use_utf8_flag', FALSE);
  $case_sensitive = variable_get('wordfilter_process_case_sensitive', FALSE);
  $default_replacement = variable_get('wordfilter_default_replacement', '[filtered word]');
  //dpm($list);
  foreach ($list as $word) {
    // Prevent mysterious empty value from blowing away the node title.
    if (!empty($word->words)) {
      $replacement = ($word->replacement) ? $word->replacement : $default_replacement;
      if ($replacement == '<none>') {
        $replacement = '';
      }
      if ($word->standalone) {
        $pattern = '/(\W)' . preg_quote($word->words, '/') . '(\W)/';
      }
      else { //**issue 1
        //$pattern = '/' . preg_quote($word->words, '/') . '/';
        $pattern = '^(.*?(\B' . preg_quote($word->words, '/') . '\B)[^$]*)$';
      }
      if (!$case_sensitive) {
        $pattern .= 'i';
      }
      if ($utf8) {
        $pattern .= 'u';
      }
      $split_text = preg_split('/(<[^>]*>)/i', drupal_substr($text, 1, -1), -1, PREG_SPLIT_DELIM_CAPTURE);
      $split_text = array_values(array_filter($split_text));
      if (count($split_text) > 1) {
        $new_string = '';
        foreach ($split_text as $part) {
          if (!preg_match('/^</', $part)) {
            //dpm($part);
            $new_string .= preg_replace($pattern, "\${1}" . $replacement . "\${2}", $part);
            //$new_string .= preg_replace($pattern, $replacement, $part);
          }
          else {
            $new_string .= $part;
          }
        }
      }
      else { //**issue 2
        $text = preg_replace($pattern, "\${1}" . $replacement . "\${2}", $text);
        //$text = preg_replace($pattern, $replacement, $text);
      }
    }
  }
  $text = drupal_substr($text, 1, -1);
  return $text;
}
Upvotes: 1
Views: 801
Reputation: 67988
\bword\w*
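You can simply use this: \b anchors the match at the start of a word, and \w* consumes the rest of the word, so the whole word gets replaced rather than just the offending characters. See demo.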
https://regex101.com/r/lR1eC9/7
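To show how that pattern could be plugged into preg_replace, here is a minimal sketch. The function name replace_whole_words and its parameters are hypothetical and not part of the wordfilter module. It also illustrates the likely cause of the "Unknown modifier '$'" warning: the pattern in the question has no delimiters, so PCRE takes the leading ^ as the opening delimiter and the ^ inside [^$] as the closing one, then tries to read the following $ as a modifier. Wrapping the pattern in / before appending i or u avoids that.

```php
<?php
/**
 * Hypothetical helper (not from the wordfilter module): replaces every
 * whole word that starts with $word by $replacement.
 */
function replace_whole_words($text, $word, $replacement, $case_sensitive = FALSE, $utf8 = FALSE) {
  // The delimiters go around the pattern; modifiers are appended after
  // the closing delimiter. preg_quote() escapes the delimiter as well.
  $pattern = '/\b' . preg_quote($word, '/') . '\w*/';
  if (!$case_sensitive) {
    $pattern .= 'i';
  }
  if ($utf8) {
    $pattern .= 'u';
  }
  // No capture groups are used, so the replacement is passed as-is.
  return preg_replace($pattern, $replacement, $text);
}

echo replace_whole_words('wording worder got worded. word!', 'word', '<deleted>');
// prints: <deleted> <deleted> got <deleted>. <deleted>!
```

Note that \b only catches words beginning with the filtered string. If words that merely contain it should also go, \w* could be used in place of \b at the start of the pattern.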
Upvotes: 1