AnswerSeeker

Reputation: 53

Problem when zipping files with special characters in PHP+Apache - encoding issue

Once again I am running into encoding problems when trying to zip files in a PHP application.

Here is the deal: whenever a file name contains special characters, e.g. 'eñeìá.html', I am not able to zip it correctly. After adding it with ZipArchive::addFile(), the name stored in the archive is 'e+¦e+¼+í.html'.

The problematic line is the following:

$zip->addFile($file_to_add_path, $file_to_add->getFilename());

I have already tried iconv, utf8_decode/utf8_encode, etc., but no luck yet. The closest I got was the result shown above, which came from using htmlentities and then decoding the entities.

I am running the application under XAMPP on Windows XP, which may be the root of the problem.

Funny thing is, when I unzip that same file inside the application, its name is fine; however, when I download the zipped file and open it... bleh.

Either way, thanks a lot in advance to anyone who could help me out or guide me a bit with this. Should there be more information needed, please do ask me for it.

Best regards

Upvotes: 3

Views: 6828

Answers (6)

Thanh Trung

Reputation: 3804

In my case, ZipArchive requires file names to be encoded in IBM850.

You need to convert the file name to IBM850 when zipping, and back to UTF-8 / ISO-8859-1 / CP1252 when extracting from the zip.

//zipping
$relativePath = iconv('UTF-8', 'IBM850', $relativePath);
$zip->addFile($filePath, $relativePath);

//extracting
$relativePath = iconv('IBM850', 'UTF-8', $zip->getNameIndex($i));
$zip->renameIndex($i, $relativePath);
$zip->extractTo($destination, $relativePath);

Upvotes: 0

lisandro

Reputation: 496

Using

$clean_filename = iconv("ISO-8859-1", "CP860", $filename);

solved my problem with Portuguese file names. Change CP860 to the code page that best covers your special characters.

https://en.wikipedia.org/wiki/Code_page

Upvotes: 0

HaSh

Reputation: 41

I had the same problem with Central European characters and solved it using iconv("UTF-8", "CP852", $string); where CP852 is an old DOS encoding for Central Europe. So it might help to use the DOS code page that matches your language (I think the expected encoding is determined by some internal ZIP configuration, or whatever).
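As a rough sketch of how that conversion could be wrapped up, the helper below converts a UTF-8 name to a DOS code page and falls back to the original name if the conversion fails. The function name toDosZipName and its default code page are made up for illustration; they are not part of ZipArchive, and the right code page depends on your language and on the tool used to open the archive.

```php
<?php
// Hypothetical helper, not part of ZipArchive: converts a UTF-8 file name
// to a legacy DOS code page before it is used as a ZIP entry name.
function toDosZipName(string $utf8Name, string $codePage = 'CP852'): string
{
    // iconv() returns false (and raises a notice, silenced here) when a
    // character cannot be represented in the target code page.
    $converted = @iconv('UTF-8', $codePage, $utf8Name);

    // Fall back to the original name rather than storing a broken entry name.
    return $converted === false ? $utf8Name : $converted;
}

// Usage sketch ($zip and $path are assumed to exist in your own code):
// $zip->addFile($path, toDosZipName(basename($path)));
```

Plain ASCII names pass through unchanged, so the helper should be safe to apply to every entry.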

Upvotes: 4

Matthew Purdon

Reputation: 773

When using iconv, did you try any of the out_charset append options (//TRANSLIT, //IGNORE)? Using the following code I am able to create an archive in which the file "los niños.txt" is added as "los nios.txt":

<?php
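// realpath() returns false for a file that does not exist yet, so build the path by hand
$archivePath = getcwd() . DIRECTORY_SEPARATOR . 'test.zip';
$archive = new ZipArchive;
// CREATE makes the archive if it is missing; OVERWRITE replaces an existing one
$opened = $archive->open($archivePath, ZipArchive::CREATE | ZipArchive::OVERWRITE);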
if ($opened === true) {
    $directory = new DirectoryIterator('.');
    foreach($directory as $fileInfo) {
        if ($fileInfo->isDot()) {
            continue;
        }

        if (preg_match('#.*\.txt$#', $fileInfo->getBasename())) {
            $cleanFilename = iconv("UTF-8", "ISO-8859-1//IGNORE", $fileInfo->getFilename());
            $archive->addFile($fileInfo->getRealPath(), $cleanFilename);
        }
    }
    $closed = $archive->close();
    if (!$closed) {
        echo "Could not create ZIP file<br/>";
    }
} else {
    echo "Could not open archive because of code {$opened}<br/>";
}

Basically, with //IGNORE, if iconv cannot represent a character in the target charset, it just drops that character and leaves the rest of the file name intact.

Upvotes: 0

Citizen

Reputation: 12927

Prior to zipping the file, try URL-encoding the file name:

http://php.net/manual/en/function.urlencode.php
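A minimal sketch of what that looks like, assuming the file name is UTF-8 (the bytes are written out explicitly so the result does not depend on how the script itself is saved):

```php
<?php
// 'eñeìá.html' spelled out as UTF-8 bytes (assumption: names are UTF-8).
$fileName = "e\xC3\xB1e\xC3\xAC\xC3\xA1.html";

// urlencode() percent-encodes every non-ASCII byte, so the entry name
// stored in the archive is plain ASCII.
$entryName = urlencode($fileName);
echo $entryName, PHP_EOL; // e%C3%B1e%C3%AC%C3%A1.html

// After extraction, the original name can be recovered with urldecode().
$restored = urldecode($entryName);
var_dump($restored === $fileName); // bool(true)
```

The trade-off is that anyone opening the archive by hand sees the percent-encoded names, so this mainly makes sense when your own code does the extracting and decoding.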

Upvotes: 2

profitphp

Reputation: 8334

Have you tried opening it with a different client, like WinRAR or something? It is probably a difference in versions. Whatever you're creating the archive with likely supports Unicode characters, while the client you're opening it with does not.

Upvotes: 0
