Dino

Reputation: 806

Unicode characters get changed when using str_replace

I'm creating a website with SEO-friendly URLs. But when I use str_replace() to replace spaces with -, some Unicode characters get changed.

function create_slug($string){
   $slug = str_replace(' ','-', $string);
   return strtolower($slug);
}

When create_slug('Google എഴുത്ത് ഉപകരണങ്ങളുടെ Chrome വിപുലീകരണം'); is called, I expect the output to be

google-എഴുത്ത്-ഉപകരണങ്ങളുടെ-chrome-വിപുലീകരണം

But sometimes I get this output instead

google-ഞഴുത്ത്-ഉപകരണങ്ങളു���െ-chrome-വിപുലീകരണം

What am I doing wrong?

Upvotes: 1

Views: 827

Answers (1)

Federkun

Reputation: 36934

If your server's locale character set doesn't support UTF-8, strtolower can mistake bytes of valid multibyte sequences for characters in a single-byte encoding and change them. The result is a corrupt UTF-8 string.

Use mb_strtolower instead.

mb_strtolower($slug, 'UTF-8');
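Applied to the function from the question, that could look like the sketch below. It assumes the mbstring extension is loaded and that the input is valid UTF-8; the type declarations are an addition, not part of the original function.

```php
<?php
// Sketch: create_slug from the question with mb_strtolower swapped in.
// Assumes the mbstring extension is available and $string is valid UTF-8.
function create_slug(string $string): string
{
    // str_replace is byte-safe here: the space byte 0x20 never occurs
    // inside a UTF-8 multibyte sequence.
    $slug = str_replace(' ', '-', $string);

    // Lowercase with an explicit encoding instead of relying on the locale.
    return mb_strtolower($slug, 'UTF-8');
}

echo create_slug('Google എഴുത്ത് ഉപകരണങ്ങളുടെ Chrome വിപുലീകരണം'), PHP_EOL;
// prints: google-എഴുത്ത്-ഉപകരണങ്ങളുടെ-chrome-വിപുലീകരണം
```

Passing 'UTF-8' explicitly means the result doesn't depend on mb_internal_encoding() either.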

If you want functions like strtolower and strtoupper to consider only characters in the ASCII range, you can override your server's locale settings with:

setlocale(LC_CTYPE, 'C');

Because ASCII is a subset of UTF-8, strtolower can now change the string without problems: it touches only the ASCII letters A-Z and leaves the bytes of multibyte sequences alone.
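A minimal sketch of that second approach follows. The name create_slug_ascii is hypothetical, chosen only so it doesn't clash with the function in the question. As far as I know, PHP 8.2 and later already make strtolower locale-insensitive, in which case the setlocale call is harmless but unnecessary.

```php
<?php
// Sketch: force the "C" locale for character classification, then use
// plain strtolower. create_slug_ascii is a hypothetical name.
setlocale(LC_CTYPE, 'C');

function create_slug_ascii(string $string): string
{
    $slug = str_replace(' ', '-', $string);

    // In the C locale only the ASCII letters A-Z are lowercased; bytes
    // >= 0x80, which make up UTF-8 multibyte sequences, are left as-is.
    return strtolower($slug);
}

echo create_slug_ascii('Google എഴുത്ത് ഉപകരണങ്ങളുടെ Chrome വിപുലീകരണം'), PHP_EOL;
// prints: google-എഴുത്ത്-ഉപകരണങ്ങളുടെ-chrome-വിപുലീകരണം
```

The trade-off is that non-ASCII letters that do have case (for example É) are not lowercased this way; mb_strtolower handles those.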

Upvotes: 2
