Reputation: 9722
I'm currently finishing off an ecard system that lets users send cards to other people by email.
The email also contains a link to view the same card in the browser. The link is generated by encoding the text with base64_encode():
'http://www.test.com/ecards?card=' . base64_encode('your text'); // like this
This works fine for English text, but once I enter some Chinese and visit the link, the characters are all messed up:
汉��N�B��Y][ۘ[�[�\�N�9�(�*���B�[�Z[��0��q��B��[\Y�YY�[�\�N�9cc�+�N�B��Y][ۘ[�[�\�N�:#��*���B��[�\�N�9.+y���
It has nothing to do with my charset: it's set to UTF-8, and when I print the same Chinese text directly it shows up perfectly.
So I'm wondering if base64_encode() and base64_decode() might have something to do with this.
// Doesn't work
echo base64_decode($body);
// Chinese characters show up fine!!!!
echo 'simplified Chinese: 汉语; <br />';
echo 'traditional Chinese: 漢語; <br />';
echo 'Pinyin: Hànyǔ; <br />';
echo 'simplified Chinese: 华语; <br />';
echo 'traditional Chinese: 華語; <br />';
echo 'Chinese: 中文; <br />';
EDIT: When I try outputting $_GET[] using a URL like http://www.test.com/ecards?card=中文, it works fine.
So it's clearly base64_encode() or base64_decode() that can't handle Chinese characters.
Upvotes: 1
Views: 6344
Reputation: 437574
The base64_ functions do not operate on characters; they operate on bytes. They will happily convert anything you pass in without error.
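A minimal sketch of that point, assuming the source file is saved as UTF-8: the round trip hands back exactly the bytes that went in. The second half covers something separate that may be worth ruling out, although the question does not confirm it is happening: standard base64 output can contain a +, which PHP turns into a space when it parses a query string, and base64_decode() skips the space, so every byte after it shifts. buildCardLink() is a hypothetical helper, not part of the original code.

```php
<?php
// Assumes this file is saved as UTF-8, so the literal holds UTF-8 bytes.
$text = '汉语';

// base64 works on bytes: decoding gives back exactly the bytes that went in.
$encoded = base64_encode($text);          // this particular result contains a '+'
$roundTripOk = (base64_decode($encoded) === $text);

// Hypothetical helper: percent-encode the base64 string before it goes into a URL.
function buildCardLink(string $text): string
{
    return 'http://www.test.com/ecards?card=' . urlencode(base64_encode($text));
}

// Simulate PHP's query-string parsing for the raw and the percent-encoded link.
parse_str('card=' . $encoded, $raw);
parse_str(parse_url(buildCardLink($text), PHP_URL_QUERY), $safe);

$rawOk  = (base64_decode($raw['card']) === $text);   // '+' became ' ', bytes shift
$safeOk = (base64_decode($safe['card']) === $text);  // '+' survived as %2B
```

If $rawOk is false while $safeOk is true, the damage happens in the URL rather than in the base64_ functions themselves.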
Your problem here is that the encoding of the characters you are using as input does not match the encoding of the page where they are displayed after decoding. Where does "your text" come from? If it's from a form submission, you need to make sure that the page where the form appears is displayed using the same encoding as the "view card" page, or that the form has an accept-charset attribute matching the encoding of the "view card" page.
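One way to act on this on the "view card" side, sketched under assumptions: the page declares its charset explicitly (UTF-8 is assumed here) and checks that the decoded bytes really are valid in that charset before echoing them. renderCard() is a hypothetical helper, and mb_check_encoding() requires the mbstring extension.

```php
<?php
// Hypothetical helper for the "view card" page; requires the mbstring extension.
function renderCard(string $param): string
{
    $bytes = base64_decode($param);

    // If the form page submitted the text in another encoding, the decoded
    // bytes will usually not be valid UTF-8.
    if ($bytes === false || !mb_check_encoding($bytes, 'UTF-8')) {
        return 'Sorry, this card could not be read.';
    }

    // Escape for HTML output, telling PHP the bytes are UTF-8.
    return htmlspecialchars($bytes, ENT_QUOTES, 'UTF-8');
}

// Declare the same charset the form page uses (assumed here to be UTF-8).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$card = renderCard(base64_encode('中文'));
```

A failed check here points at the submitting page's encoding, which is where the accept-charset attribute or a matching page charset would need fixing.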
Upvotes: 5