Reputation: 2758
I have the following string as a filename:
$string = 'recyclage plétre francin.jpg';
and I tried the following code:
echo preg_replace('/[^a-z0-9|^.]/i', '_', iconv("UTF-8","ISO-8859-1//TRANSLIT",$string));
Because the filename contains a special (non-ASCII) character, it produces junk characters when handling file uploads in PHP.
What I want is to replace every Unicode (non-ASCII) character with a specific ASCII character.
I want to keep all supported ASCII characters and remove the non-ASCII ones. I also want to keep / or \ slashes, because they are the directory separators in filenames where a root path is given.
Edit: (below is not solved)
I am having an issue with recyclage plƒtre francin.JPG
Please note the ƒ character: the output is displayed as recyclage pl and the .JPG has been truncated. The actual file name was recyclage plâtre francin, and while debugging it showed recyclage plƒtre francin.JPG, with the rest written just after that. Any idea?
I am trying to convert tri et recyclage du plâtre. When the name is read it shows tri et recyclage du plâtre, and after conversion it shows tri et recyclage du pl^atre.
Any help will be appreciated.
Upvotes: 3
Views: 4933
Reputation: 2820
You can use a simple approach that removes all characters except the separator (-), a-z, 0-9, underscore, and whitespace.
// Remove all characters that are not the separator, a-z, 0-9, underscore, or whitespace
$string = preg_replace('![^'.preg_quote('-').'a-z0-9_\s]+!', '', strtolower($string));
// Replace all separator characters and whitespace by a single separator
$string = preg_replace('!['.preg_quote('-').'\s]+!u', '-', $string);
Upvotes: 0
Reputation: 2758
Here is a solution to my question. I was finally able to get the conversion working. Some Unicode characters are replaced with ASCII characters, and everything is now working fine.
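function toASCII($string)
{
    // Lookup tables: each character in $accent maps to the character at the same position in $noaccent
    $accent = 'ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûýýþÿŔŕƒ';
    $noaccent = 'SOZsozYYuaaaaaaaceeeeiiiidnoooooouuuuybsaaaaaaaceeeeiiiidnoooooouuuyybyRra';
    // Decode the input and the table to ISO-8859-1 so strtr() compares single bytes
    // (note: utf8_decode() is deprecated as of PHP 8.2)
    $string = strtr(utf8_decode($string), utf8_decode($accent), $noaccent);
    return strtr($string, $accent, $noaccent);
}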
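For anyone who wants to avoid the byte-wise tables, here is a minimal sketch of the same idea using the array form of strtr(), which replaces whole keys and so never splits a multi-byte UTF-8 character. The function name toAsciiByMap and the short table are my own, for illustration only. It assumes the source file and the input are UTF-8 and that accented letters arrive in precomposed form; a name stored as a base letter plus a combining accent would not match this table and would need normalizing first.

```php
<?php
// Hypothetical helper (not part of the answer above): maps accented
// characters to ASCII one UTF-8 character at a time.
function toAsciiByMap(string $str): string
{
    // Small illustrative table; extend it with the characters you expect.
    $map = [
        'À' => 'A', 'Â' => 'A', 'Ç' => 'C', 'É' => 'E', 'È' => 'E', 'Ê' => 'E',
        'à' => 'a', 'â' => 'a', 'ç' => 'c', 'é' => 'e', 'è' => 'e', 'ê' => 'e',
        'î' => 'i', 'ô' => 'o', 'ù' => 'u', 'û' => 'u', 'œ' => 'oe', 'ß' => 'ss',
    ];
    // With an array argument, strtr() matches complete keys,
    // so the two bytes of "â" are always replaced together.
    return strtr($str, $map);
}

echo toAsciiByMap('tri et recyclage du plâtre.jpg'), "\n";
```

Characters that are not in the table pass through unchanged, so this is best combined with a final filter that strips whatever non-ASCII is left.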
Upvotes: 3
Reputation: 12537
If you use the TRANSLIT modifier, it only replaces characters that can't be represented in the target encoding. Since é can be represented in ISO-8859-1, it is simply encoded as the single byte 0xE9.
I guess you want something like this:
$string = 'recyclage plétre francin.jpg';
echo iconv("UTF-8","ASCII//TRANSLIT",$string);
The result of that iconv call is: recyclage pletre francin.jpg
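Since the question also asks to keep / and \ as directory separators, here is a rough sketch that combines the iconv call with a whitelist. The function name sanitizePath and the default replacement character are assumptions of mine. Keep in mind that what TRANSLIT produces depends on the iconv implementation and the locale, so I am not promising a particular output for accented input.

```php
<?php
// Hypothetical helper: transliterate to ASCII, then replace everything
// that is not on the whitelist with $replacement.
function sanitizePath(string $path, string $replacement = '_'): string
{
    // iconv() returns false on failure; the @ hides the notice some builds raise.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $path);
    if ($ascii === false) {
        // Fall back to the raw input; the regex below still replaces non-ASCII bytes.
        $ascii = $path;
    }
    // Keep letters, digits, dot, underscore, dash, and both slash styles.
    return preg_replace('~[^A-Za-z0-9._\-/\\\\]~', $replacement, $ascii);
}

echo sanitizePath('uploads/recyclage plétre francin.jpg'), "\n";
```

Spaces are replaced as well here; add a space to the character class if you want to keep them.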
Upvotes: 6
Reputation: 212
Try this code:
<?php
$string = 'recyclage plétre francin.jpg';
$str = preg_replace('/[^\x20-\x7E]/', '', $string);
echo $str;
?>
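If you would rather replace each non-ASCII character with a specific ASCII character instead of dropping it, a small variation could look like the sketch below. The function name replaceNonAscii is made up for this example, and it assumes the input is UTF-8. The /u modifier makes the pattern match whole characters, so é becomes one underscore; without it the two bytes of é would each be replaced.

```php
<?php
// Hypothetical helper: replace every character outside printable ASCII
// (0x20-0x7E) with a single replacement character.
function replaceNonAscii(string $s, string $replacement = '_'): string
{
    // /u treats $s as UTF-8, so a multi-byte character counts as one match.
    $out = preg_replace('/[^\x20-\x7E]/u', $replacement, $s);
    // preg_replace() returns null when $s is not valid UTF-8;
    // in that case fall back to byte-wise matching.
    return $out ?? preg_replace('/[^\x20-\x7E]/', $replacement, $s);
}

echo replaceNonAscii('recyclage plétre francin.jpg'), "\n"; // recyclage pl_tre francin.jpg
```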
Upvotes: 1