Pascale Abou Abdo

Reputation: 387

str_replace won't replace Arabic characters

<?php 
$utf8_string = 'مع السلامة مع السلامة مع السلامة مع السلامة مع السلامة مع السلامة مع السلامة السلامة الرائعة على الطويلة ';
echo $utf8_string;
echo'<br/><br/>';

$patterns = array("على", "مع");
$replacements   = array("", "");

$r_string = str_replace($patterns, $replacements, $utf8_string);

//echo $r_string;
print_r ($r_string);
echo'<br/>';
//$words = preg_split( "/ ( |مع|على) /",$r_string);
$words = explode(" ",$r_string);

$num = count($words);
echo 'There are <strong>'.$num.'</strong> words.';
?>

I have this code to count the number of words in an Arabic sentence. However, I want to remove some words and count the rest. I tried using str_replace, but the result is still the number of words in the original sentence. Can anyone help me?

Upvotes: 3

Views: 1650

Answers (3)

Pedro Cordeiro

Reputation: 2125

You could use:

$num = count(
    explode(
        " ", 
        str_replace(
            $word, //Word you want to remove from your text.
            "",
            $string //String you want the word to be removed from.
        )
    )
);

Or even:

$num = count(
    explode(
        " ", 
        str_replace(
            array("word1", "word2" /* , ... */), //Words you want to remove from your text.
            "",
            $string //String you want the word to be removed from.
        )
    )
);

EDIT: As pointed out, the above won't work. I tried pinpointing where the error is, and apparently str_replace can't handle Arabic characters, even though explode can. PHP is not reliable with non-ASCII characters.

What you can do, alternatively, is:

$num = count(explode(" ", $utf8_string)) - count(array_intersect(explode(" ", $utf8_string), $patterns));

It should return the value you want.
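Here is a minimal sketch of the same idea, assuming it is acceptable to split on any run of whitespace and to discard empty pieces (a trailing space would otherwise add one empty "word" to the count). The helper name countKeptWords is made up for illustration; it is not a PHP built-in.

```php
<?php
// Hypothetical helper: counts the words in $text that are not in $stopWords.
function countKeptWords(string $text, array $stopWords): int
{
    // Split on runs of whitespace. The /u modifier makes PCRE treat the
    // subject as UTF-8, and PREG_SPLIT_NO_EMPTY drops the empty pieces
    // that leading or trailing spaces would otherwise produce.
    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return 0; // e.g. the input was not valid UTF-8
    }

    // array_diff keeps every element of $words (duplicates included)
    // that does not appear in $stopWords.
    return count(array_diff($words, $stopWords));
}

$text     = 'مع السلامة مع السلامة الرائعة على الطويلة ';
$patterns = array("على", "مع");

echo countKeptWords($text, $patterns); // prints 4
```

Because the comparison is done word by word, a stop word that happens to occur inside a longer word is left alone, which is not the case with a plain str_replace.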

You could also try writing your own string replacement function, but I would advise against it, since you'd have to loop through your array manually and compare each word. That would likely run slower and make the code much more verbose.


Coming here to warn y'all that the correct way to handle this is with the mbstring extension (http://php.net/manual/en/book.mbstring.php). Please use that extension, not the ugly hack/workaround above.
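For what it's worth, a rough sketch of what an mbstring-based version might look like, assuming the extension is installed with its regex functions (mb_split, mb_regex_encoding) enabled. The function name countWordsMb is only an illustration, not part of mbstring.

```php
<?php
// Hypothetical helper built on mbstring's regex functions.
// Assumes mbstring was compiled with multibyte regex support.
function countWordsMb(string $text, array $stopWords = array()): int
{
    // Tell the mb_ereg family that patterns and subjects are UTF-8.
    mb_regex_encoding('UTF-8');

    // Split on runs of whitespace; trim() removes ASCII whitespace
    // at both ends so no empty piece appears at the start or end.
    $words = mb_split('\s+', trim($text));
    if ($words === false) {
        return 0;
    }

    // Keep non-empty words that are not in the stop list.
    $kept = array_filter($words, function (string $word) use ($stopWords): bool {
        return $word !== '' && !in_array($word, $stopWords, true);
    });

    return count($kept);
}
```

With the variables from the question, countWordsMb($utf8_string, $patterns) would then return the number of words left after the stop words are skipped.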

Upvotes: 4

user-457786

Reputation: 107

Use $num = str_word_count($r_string);

Instead of $num = count($words);

Upvotes: 0

Saic Siquot

Reputation: 6513

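You need to remove duplicate spaces after removing the words and before splitting on spaces with explode. The removed words leave their surrounding spaces behind, so explode still returns the same number of pieces (some of them empty). trim (or a similar regex) is also needed for spaces at the start and end of the string: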

    $r_string = trim(preg_replace('/\s+/u',' ',$r_string));
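
Putting it together, here is a small sketch of the whole sequence, assuming the words to remove are given as an array like $patterns in the question. The function name countAfterRemoval is mine, chosen for illustration.

```php
<?php
// Hypothetical helper: remove $stopWords from $text, normalise the
// whitespace, then count the words that remain.
function countAfterRemoval(string $text, array $stopWords): int
{
    // Remove the unwanted words; their surrounding spaces stay behind.
    $stripped = str_replace($stopWords, '', $text);

    // Collapse each run of whitespace into one space, then trim both ends.
    $clean = trim(preg_replace('/\s+/u', ' ', $stripped));

    // explode() on an empty string returns one empty element, so guard it.
    return $clean === '' ? 0 : count(explode(' ', $clean));
}

$text     = 'مع السلامة الرائعة على الطويلة ';
$patterns = array("على", "مع");

// Without the clean-up the leftover spaces are still counted:
echo count(explode(' ', str_replace($patterns, '', $text))); // prints 6
echo "\n";
echo countAfterRemoval($text, $patterns); // prints 3
```

Note that str_replace matches substrings, so a stop word that also occurs inside a longer word would be cut out of that word too.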

Upvotes: 1
