Admin File3

Reputation: 117

Replace whole words in a string without replacing replacements

I need to swap words in a string: each word must be replaced only once, so that a replacement is not itself replaced again. The code I wrote does not work, and I cannot find an answer to my question.

Input:

hello w1 w2 w12 new1 new12 new2

Expected output:

hello new1 new2 w12 w1 new12 w2

I need to replace whole words or phrases in the text according to these rules:

w1 replace with new1

w12 unchanged

w2 replace with new2

new1 replace with w1

new12 unchanged

new2 replace with w2

Note that my actual text is in Persian.

My code is:

$string="hello w1 w2 w12 new1 new12 new2";

$fword= array("w1","w2");
$lword= array("new1","new2");

$cnt=0;
$string=str_replace($fword,$lword,$string,$cnt);
$string=str_replace($lword,$fword,$string,$cnt);
echo "<h2>Change in string: $cnt <br> New String: $string </h2>";

But the output is wrong.

I also tried this code:

$string="hello w1 w2 w12 new1 new12 new2";

$fword= array("w1","w2","new1","new2");
$lword= array("new1","new2","w1","w2");

$cnt=0;
$string=str_replace($fword,$lword,$string,$cnt);
echo "<h2>Change in string: $cnt <br> New String: $string </h2>";

Upvotes: 0

Views: 1426

Answers (4)

mickmackusa

Reputation: 47894

To ensure that your script is only replacing whole words, use word boundaries around a piped collection of all search strings. In the callback of preg_replace_callback(), search the mapping array for the appropriate replacement value for the matched substring.

This technique will not replace replacements because the input string is only traversed once.

Code: (Demo)

$string = 'hello w1 w2 w12 new1 new12 new2';

$map = [
    'w1' => 'new1',
    'w2' => 'new2',
    'new1' => 'w1',
    'new2' => 'w2',
];

$subpattern = implode('|', array_map('preg_quote', array_keys($map)));

echo preg_replace_callback(
         '#\b(?:' . $subpattern . ')\b#u',
         fn($m) => $map[$m[0]] ?? $m[0],
         $string
     );
// hello new1 new2 w12 w1 new12 w2

If word boundaries are not working for your real Persian content, then you'll need to offer better sample data so that a tailored pattern can be crafted.
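
If `\b` turns out to be too permissive for Persian text (for instance, a zero-width non-joiner inside a word is not a word character, so it may register as a boundary), one alternative is to require whitespace or a string edge on both sides of the match. The following is a minimal sketch under that assumption; the sample sentence and the `$map` entries are made up for illustration and are not taken from the question.

```php
<?php
// Illustrative data only: کتاب (book) and دفتر (notebook) are swapped,
// while کتابخانه (library) must stay untouched even though it starts with کتاب.
$string = 'سلام کتاب دفتر کتابخانه';

$map = [
    'کتاب' => 'دفتر',
    'دفتر' => 'کتاب',
];

// Quote each key for use inside a pattern delimited by '#'.
$subpattern = implode('|', array_map(fn($k) => preg_quote($k, '#'), array_keys($map)));

// (?<!\S) and (?!\S) assert that no non-whitespace character touches the match,
// so only tokens delimited by whitespace or the string edges are replaced.
$result = preg_replace_callback(
    '#(?<!\S)(?:' . $subpattern . ')(?!\S)#u',
    fn($m) => $map[$m[0]],
    $string
);

echo $result;
// سلام دفتر کتاب کتابخانه
```

The trade-off of this variant is that a word directly followed by punctuation is not treated as a whole word, so it is only a fit when the words are separated by whitespace.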

Upvotes: 0

ghostprgmr

Reputation: 488

You can tokenize your string with strtok.

Then loop over the tokens in reverse, and if the truncated token is in the list of allowed words, replace it (you can use a mapping array like ["W1" => "E1", ...]). If a word has already been replaced, skip it and move on.
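
The tokenizing idea can be sketched roughly as follows. This is a simplified illustration, not the exact procedure described above: it walks the tokens once in forward order instead of in reverse, it assumes the words are separated by single spaces, and the helper name `swapTokens` is hypothetical.

```php
<?php
// Hypothetical helper: swap whole tokens using a lookup map.
function swapTokens(string $text, array $map): string
{
    $out = [];
    // strtok() splits on the delimiter and skips empty tokens,
    // so runs of spaces collapse into a single separator.
    for ($tok = strtok($text, ' '); $tok !== false; $tok = strtok(' ')) {
        // Each token is looked up exactly once, so a replacement
        // can never be replaced a second time.
        $out[] = $map[$tok] ?? $tok;
    }
    return implode(' ', $out);
}

$map = ['w1' => 'new1', 'w2' => 'new2', 'new1' => 'w1', 'new2' => 'w2'];

echo swapTokens('hello w1 w2 w12 new1 new12 new2', $map);
// hello new1 new2 w12 w1 new12 w2
```

Because whole tokens are compared against the map keys, `w12` and `new12` are left alone without any regular expression.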

Upvotes: 2

Sahil Gulati

Reputation: 15141

I know this is a lengthy one, but I have tried my best to get it done.

PHP code demo

<?php
ini_set("display_errors", 1);

echo $string="hello w1 w2 w12 new1 new12 new2";
$fword= array("w1","w2","new1","new2");
$lword= array("new1","new2","w1","w2");

//---------Working----------->
$replacement=  array_combine($fword, $lword);
$totrimOffsets=array();
$indexes=findIndexes($fword);
$string=preg_replace("/\~\~{1,}/", "~~", $string);
$newString=replace();
//--------------------------->
echo PHP_EOL;
echo $newString;
function findIndexes($array)
{
    global $totrimOffsets,$string,$replacement;
    $indexes=array();
    foreach($array as $element)
    {
        preg_match_all("/\b$element\b/i", $string,$matches,PREG_OFFSET_CAPTURE);
        if(isset($matches[0]) && count($matches[0])>0)
        {
            foreach($matches[0] as $matchData)
            {
                $indexes[$element][]=array("element"=>$element,"offset"=>$matchData[1],"length"=>  strlen($element));
                $totrimOffsets[]=$matchData[1].",".($matchData[1]+strlen($element)-1).",".$element.",".$replacement[$element];
                $string=  substr_replace($string, getSpecialChars(strlen($element)), $matchData[1],strlen($element));
            }
        }
    }
    sort($totrimOffsets,SORT_NUMERIC);
    return $indexes;
}
function replace()
{
    global $string,$totrimOffsets,$indexes;
    $stringArray=explode("~~",$string);
    $newString="";
    foreach($stringArray as $key => $value)
    {
        $newString.=$value;
        if(isset($totrimOffsets[$key]))
        {
            $newString.=explode(",",$totrimOffsets[$key])[3];
        }
    }
    return $newString;
}
function getSpecialChars($length)
{
    $dummyString="";
    for($x=0;$x<$length;$x++)
    {
        $dummyString.="~";
    }
    return $dummyString;
}

Upvotes: 2

Torge

Reputation: 2284

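You should use preg_replace(). Put (^|.*\s) before the search term and (\s.*|$) after it, so that the term must be preceded by the start of the string or whitespace and followed by whitespace or the end of the string. This avoids replacing partial matches.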

$string="hello w1 w2 w12 new1 new12 new2";

$replacements = array(
    "w1" => "new1",
    "w2" => "new2",
    "new1" => "w1",
    "new2" => "w2"
);

foreach ($replacements as $from=>$to) {
    $string = preg_replace(
                '/(^|.*\s)'.preg_quote($from).'(\s.*|$)/',
                '\1'.preg_quote($to).'\2', 
                $string);
}

echo $string;

If only the first occurrence should be replaced, you can also pass a limit of 1 as the 4th parameter of this function.
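
As a small illustration of the limit argument of preg_replace(), here is a sketch that uses a simpler `\b`-delimited pattern chosen only to make the effect of the limit visible; the input string is made up and is not from the question.

```php
<?php
// Illustrative input, not from the question.
$text = 'w1 and w1 again';

// The fourth argument caps the number of replacements made per pattern.
$once = preg_replace('/\bw1\b/', 'new1', $text, 1);

echo $once;
// new1 and w1 again
```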

Update: Detailed explanation

(^|.*\s): First capture group. Matches either the start of the string, or any leading text that ends in a whitespace character.

preg_quote($from): The string to replace. preg_quote() escapes the characters that have a special meaning in regular expressions, so the search term is matched literally. Other characters, including non-ASCII ones, pass through unchanged.

(\s.*|$): Second capture group. Matches either a whitespace character followed by the rest of the string, or the end of the string.

'\1'.preg_quote($to).'\2': The replacement. First group + new string + second group.

Update 2:

Removed an unnecessary group from the code and added escaping so that it is more generally applicable to all kinds of input.

Upvotes: 2
