Reputation: 3083
My website allows users to upload files with any name. Some names, of course, will have non-ASCII characters. When a user uploads a file, I save it in a folder under its original name. However, when I try to download it by accessing its location (for example, files/Tolstoy - How much land does a man need?.pdf), I get a 404. Is there some way to solve this so that the files keep their original names? Via Apache, maybe?
Upvotes: 0
Views: 2095
Reputation: 53513
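Just use URL encoding, also known as percent-encoding; it is meant for exactly this kind of URL. Every URL printed into HTML should be URL-encoded.
In PHP, use rawurlencode() rather than urlencode(). rawurlencode() follows RFC 3986 and encodes a space as %20, whereas urlencode() produces form-style encoding with a space as +, which is not treated as a space in a URL path.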
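As a minimal sketch of the difference, using the filename from the question: the helper name fileUrl() is hypothetical, and the "files" directory is taken from the example path. Note that the unencoded "?" in that filename would start a query string, which on its own is enough to cause a 404.

```php
<?php
// Hypothetical helper (not from the thread): builds a link to an uploaded file.
function fileUrl(string $dir, string $filename): string
{
    // rawurlencode() percent-encodes per RFC 3986: space -> %20, "?" -> %3F.
    return $dir . '/' . rawurlencode($filename);
}

$name = 'Tolstoy - How much land does a man need?.pdf';

echo fileUrl('files', $name), "\n";
// files/Tolstoy%20-%20How%20much%20land%20does%20a%20man%20need%3F.pdf

echo urlencode($name), "\n";
// Tolstoy+-+How+much+land+does+a+man+need%3F.pdf
```

When the result is printed into an href attribute, it would still go through htmlspecialchars() like any other attribute value.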
Edit: regarding this issue:
PHP encodes "é" as "e%26%23769%3B", instead of "e%CC%81"
e%CC%81 is the percent-encoded UTF-8 for é written as e followed by the combining acute accent (U+0301). e%26%23769%3B is the percent-encoding of the text e&#769;, which is the HTML numeric character reference for the same thing. This means that either you are explicitly calling htmlentities() before URL-encoding, or your server setup does that automatically. It isn't strictly needed if proper character sets are in place (only an htmlspecialchars() call is actually needed), but it shouldn't break anything either.
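To check this, both strings from the question can be decoded and compared. This is only a sketch; the variable names are illustrative.

```php
<?php
// Decode both forms from the question to see what each one carries.
$asEntity = rawurldecode('e%26%23769%3B'); // the literal text "e&#769;"
$asUtf8   = rawurldecode('e%CC%81');       // "e" followed by bytes 0xCC 0x81 (U+0301)

// Resolving the numeric character reference yields the same bytes as the UTF-8 form.
$resolved = html_entity_decode($asEntity, ENT_QUOTES | ENT_HTML5, 'UTF-8');

var_dump($asEntity === 'e&#769;');  // bool(true)
var_dump($resolved === $asUtf8);    // bool(true)
var_dump(rawurlencode($resolved));  // string(7) "e%CC%81"
```

So decoding the character reference before calling rawurlencode() gives the shorter e%CC%81 form.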
Upvotes: 1
Reputation: 3083
Well, for some reason that I still don't understand, using rawurlencode() instead of urlencode() made it work.
However, the character é (among others, I'm sure) is still being encoded strangely (e%26%23769%3B instead of simply %C3%A9). Even stranger is that the links containing it work.
Upvotes: 0
Reputation: 3172
Workaround: convert filenames to ASCII at upload. You will be happy with it.
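A rough sketch of that workaround, assuming a deliberately conservative whitelist: the function name asciiFilename() and the fallback name "file" are hypothetical, and different originals can map to the same result, so a real upload handler would also need to deal with collisions.

```php
<?php
// Hypothetical helper: reduce an uploaded filename to a conservative ASCII subset.
function asciiFilename(string $original): string
{
    // Replace every run of bytes outside A-Z, a-z, 0-9, ".", "_" and "-" with "_".
    // Slashes are replaced too, so the result never contains a directory part.
    $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $original);

    // Fall back to a fixed name if nothing but dots and underscores is left.
    return trim($name, '._') === '' ? 'file' : $name;
}

echo asciiFilename('Tolstoy - How much land does a man need?.pdf'), "\n";
// Tolstoy_-_How_much_land_does_a_man_need_.pdf
```

The original name is lost this way, so it would have to be stored elsewhere if it should still be shown to users.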
Upvotes: 0