SmxCde

Reputation: 5403

PHP - Read non-latin character dir/file name

I have some files and dirs (on Windows, but eventually I will run the same script on Mac and Linux) with non-latin characters in names, for example:

Dir name 01 - Проверка - X.

I am trying to read that name and print it but without success - I always get 01 - ???????? - X instead.

What I have tried:

$items = scandir('c:/myDir/');
$name = $items[2];

echo mb_detect_encoding($name); // Returns "ASCII"
echo '<br>';

echo $n = mb_convert_encoding($name, 'UTF-8', 'Windows-1252');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-15');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'Windows-1252', 'UTF-8');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'ISO-8859-1', 'UTF-8');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = mb_convert_encoding($name, 'ISO-8859-15', 'UTF-8');
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('WINDOWS-1252', 'UTF-8', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('ISO-8859-1',   'UTF-8', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('ISO-8859-15',  'UTF-8', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('UTF-8', 'WINDOWS-1252', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('UTF-8', 'ISO-8859-1', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

echo $n = iconv('UTF-8', 'ISO-8859-15', $name);
echo '<br>';
echo base64_encode($n);
echo '<br><br>';

As a result, I always get the same line (I base64-encoded it so you can see that it really is the same line):

ASCII
01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

01 - ???????? - X
MDEgLSA/Pz8/Pz8/PyAtIFg=

What can I do about it?

P.S. What I am ultimately trying to achieve: I need to compare two directories. When I read the contents of one directory, I cannot compare it to the other because the dir/file names are broken - my script gets the name 01 - ???????? - X and obviously cannot find such a subdir in the second directory.

Upvotes: 0

Views: 635

Answers (1)

bitten

Reputation: 2543

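Actually my previous answer was not right. The problem is that PHP 5 on Windows does not support UTF-8 (Unicode) file names in its file operations, so characters that are not in the system's ANSI code page come back as ?.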
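One way to confirm this is to decode the Base64 string posted in the question. The name already consists of literal ? bytes (0x3F), so the original characters were lost before PHP returned the string, and no mb_convert_encoding() or iconv() call can bring them back. A minimal check, using only the value from the question:

```php
<?php
// Base64 value copied from the question's output.
$encoded = 'MDEgLSA/Pz8/Pz8/PyAtIFg=';
$name = base64_decode($encoded);

// True when every byte is 7-bit ASCII, which is why
// mb_detect_encoding() reported "ASCII" in the question.
$isAscii = (preg_match('/^[\x00-\x7F]*$/', $name) === 1);

// Number of literal question marks (byte 0x3F) in the name.
$questionMarks = substr_count($name, '?');

echo $name, PHP_EOL;          // the decoded name
echo bin2hex($name), PHP_EOL; // its raw bytes in hex
```

Since the bytes are plain ASCII, the fix has to happen where the name is read, not in a conversion afterwards.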

A workaround would be to use something like WFIO, which exposes its own protocol for file streams and allows PHP to handle UTF-8 characters in file operations. You can see in the README that the syntax would be:

scandir("wfio://directory")
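For the directory comparison described in the question, a rough sketch could look like the one below. It assumes the WFIO extension registers a wfio stream wrapper, which is checked at runtime with stream_get_wrappers(), and it falls back to the plain path otherwise. listNames() and compareDirs() are hypothetical helper names, not part of WFIO or PHP. On PHP 7.1 and later the plain path may already be enough on Windows, since that release added UTF-8 path support there.

```php
<?php
// Hypothetical helper: list the entry names in a directory, preferring
// the wfio:// wrapper when the WFIO extension has registered it.
// No scalar type declarations, so this also runs on PHP 5.4+.
function listNames($dir)
{
    // Assumption: plain paths are fine on Mac and Linux, where the
    // wrapper is normally not registered.
    $path = in_array('wfio', stream_get_wrappers(), true)
        ? 'wfio://' . $dir
        : $dir;

    $items = scandir($path);
    if ($items === false) {
        throw new RuntimeException("Cannot read directory: $dir");
    }

    // Drop "." and ".." by name instead of relying on their position.
    return array_values(array_diff($items, ['.', '..']));
}

// Hypothetical helper: report names present in only one of the two directories.
function compareDirs($a, $b)
{
    $namesA = listNames($a);
    $namesB = listNames($b);

    return [
        'only_in_a' => array_values(array_diff($namesA, $namesB)),
        'only_in_b' => array_values(array_diff($namesB, $namesA)),
    ];
}
```

Calling compareDirs() with c:/myDir and the second directory returns the names missing on either side. Filtering "." and ".." by name also avoids depending on $items[2] being the first real entry.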

Good luck!

Upvotes: 2
