Reputation: 444
I am creating some download links. My problem is that if the file "MY_FILE_NAME.doc" is saved under a name that uses only English characters, it downloads fine. If I save it under a name that uses Greek characters, I cannot download it. I am using UTF-8 encoding.
(I don't know if it matters, but Greek characters display all over my page without any problem.)
Here is my link:
<!DOCTYPE>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>CFS Portal</title>
</head>
<body>
<div>
<span class="txt">
<a href="scripts/download.php?file=MY_FILE_NAME.doc" class="dload">Ιατρικό</a>
</span></div>
</body>
And here is my download.php file:
<?php
// block any attempt to the filesystem
if (isset($_GET['file']) && basename($_GET['file']) == $_GET['file']) {
$filename = $_GET['file'];
} else {
$filename = NULL;
}
$err = '<p class="err_msg">
STOP / ΣΤΑΜΑΤΗΣΤΕ
</p>';
if (!$filename) {
// if variable $filename is NULL or false display the message
echo $err;
} else {
// define the path to your download folder plus assign the file name
$path = 'downloads/'.$filename;
// check that file exists and is readable
if (file_exists($path) && is_readable($path)) {
// get the file size and send the http headers
$size = filesize($path);
header('Content-Type: application/octet-stream;');
header('Content-Length: '.$size);
header('Content-Disposition: attachment; filename='.$filename);
header('Content-Transfer-Encoding: binary');
// open the file in binary read-only mode
// display the error messages if the file can't be opened
$file = @ fopen($path, 'rb');
if ($file) {
// stream the file and exit the script when complete
fpassthru($file);
exit;
} else {
echo $err;
}
} else {
echo $err;
}
}
?>
Upvotes: 0
Views: 3104
Reputation: 21
There might be several problems in your script:
1. File system encoding issue
To access the file on the file system, you must encode the file name the way the file system expects. The LC_CTYPE locale parameter tells you the expected encoding of file names on disk.
Under Unix/Linux or a similar operating system, that parameter might evaluate to something like "en_US.UTF-8", which means file names are encoded as UTF-8 and no conversion is required in your script.
On a Windows server, UTF-8 is not allowed, and LC_CTYPE typically evaluates to something like "language_country.codepage", where "codepage" is the number of the currently active code page, for example 1252 (Western countries) or 1253 (Greek). The list of available Windows code pages is at http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/. So, under Windows, you need to convert the UTF-8 file name to the active code page before you can read the file from disk. More details are in my reply to PHP bug no. 47096: https://bugs.php.net/bug.php?id=47096
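As a rough illustration of that conversion step, here is a minimal sketch using iconv(). The helper name to_fs_name and the default target encoding CP1253 are assumptions made for the example; the right code page depends on the server, and whether iconv knows a given encoding name depends on the iconv implementation PHP was built with.

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to the encoding the
// file system expects. The default 'CP1253' (Greek) is only an assumption;
// use whatever code page LC_CTYPE reports on your server.
function to_fs_name(string $utf8Name, string $fsEncoding = 'CP1253'): ?string
{
    if ($fsEncoding === 'UTF-8') {
        return $utf8Name; // file system already uses UTF-8, nothing to convert
    }
    // iconv() returns false when the conversion fails.
    $converted = @iconv('UTF-8', $fsEncoding, $utf8Name);
    return $converted === false ? null : $converted;
}

// Usage: build the on-disk path from the converted name.
$fsName = to_fs_name('Ιατρικό.doc');
$path   = $fsName === null ? null : 'downloads/' . $fsName;
```

The converted name is only for file system calls such as file_exists() and fopen(); the name sent to the browser should stay in UTF-8.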
2. HTTP file name encoding issue
In my experience, different browsers support different ways of encoding non-ASCII file names, and they behave differently when special or invalid characters are present. For example, under Windows the question mark '?' is not allowed in file names, and browsers either drop that character or replace the whole file name with a randomly generated one. Anyway, the following chunk of code works with most browsers:
$file_name = "Caffé Brillì.pdf"; # file name, UTF-8 encoded
$file_mime = "application/pdf"; # MIME type
$file_path = "absolute/or/relative/path/file";
header("Content-Type: $file_mime");
header("Content-Length: " . filesize($file_path));
$agent = $_SERVER["HTTP_USER_AGENT"];
if( is_int(strpos($agent, "MSIE")) ){
    # Remove reserved chars: :\/*?"|
    $fn = preg_replace('/[:\\x5c\\/*?"|]/', '_', $file_name);
    # Non-standard URL encoding:
    header("Content-Disposition: attachment; filename=" . rawurlencode($fn));
} else if( is_int(strpos($agent, "Gecko")) ){
    # RFC 2231:
    header("Content-Disposition: attachment; filename*=UTF-8''" . rawurlencode($file_name));
} else if( is_int(strpos($agent, "Opera")) ) {
    # Remove reserved chars: :\/*{?
    $fn = preg_replace('/[:\\x5c\\/{?]/', '_', $file_name);
    # RFC 2231:
    header("Content-Disposition: attachment; filename*=UTF-8''" . rawurlencode($fn));
} else {
    # RFC 2616 ASCII-only encoding:
    $fn = mb_convert_encoding($file_name, "US-ASCII", "UTF-8");
    $fn = (string) str_replace("\\", "\\\\", $fn);
    $fn = (string) str_replace("\"", "\\\"", $fn);
    header("Content-Disposition: attachment; filename=\"$fn\"");
}
readfile($file_path);
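A different approach from the user-agent sniffing above, sketched here only as an alternative, is to send both an ASCII fallback (filename=) and the UTF-8 form (filename*=) in a single header, as RFC 6266 describes. The helper name content_disposition is hypothetical, and this has not been checked against the older browsers the code above targets.

```php
<?php
// Hypothetical helper: build a Content-Disposition value that carries an
// ASCII fallback name plus the percent-encoded UTF-8 name.
function content_disposition(string $utf8Name): string
{
    // Fallback: replace every byte outside printable ASCII, plus the
    // double quote and the backslash, with an underscore.
    $fallback = preg_replace('/[^\x20-\x7e]|["\\\\]/', '_', $utf8Name);
    return 'attachment; filename="' . $fallback . '"'
         . "; filename*=UTF-8''" . rawurlencode($utf8Name);
}

// Usage (inside the download script, before any output is sent):
// header('Content-Disposition: ' . content_disposition($file_name));
```

Clients that understand filename* are expected to prefer it over the plain filename parameter, while others fall back to the ASCII name.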
Hope this may help.
Upvotes: 2
Reputation: 407
I think the problem lies with the encoding [between your script and the file system].
Verify the encoding used by the underlying file system, since that is how you are accessing the file. If the file system uses ISO-8859-1 (or similar) and you are sending the variable as UTF-8, then there is an issue.
Or it might be as simple as running urldecode on the variable you get from the GET request (you might also need to convert it further after this to match the encoding of the file system).
$filename = urldecode($_GET['file']);
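To verify what encoding the script actually receives, one option is to check whether the value is well-formed UTF-8 and to look at its raw bytes. This is a minimal sketch; is_valid_utf8 is a hypothetical helper name, and the literal below merely stands in for the request parameter.

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
// With the /u modifier, preg_match() returns false on malformed input.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$name = 'Ιατρικό.doc';        // stands in for $_GET['file']
$ok   = is_valid_utf8($name); // whether the received bytes are valid UTF-8
$hex  = bin2hex($name);       // raw bytes, to compare with the name on disk
```

If the check fails, or the hex dump does not match the bytes of the name stored on disk, the mismatch is between the request encoding and the file system encoding.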
Upvotes: 0