imlokesh

Reputation: 2739

PHP urlencode for chinese characters

I'm creating a PHP application that involves sending Chinese characters as URL parameters.

I have to send a query like:

http://xyz.com/?q=新

But the script at xyz.com won't automatically encode the Chinese character, so I need to explicitly send an encoded string as the parameter. It becomes:

http://xyz.com/?q=%E6%96%B0

The problem is that PHP won't encode the Chinese character properly.

I've tried urlencode() and rawurlencode(), but they output %D0%C2 (which doesn't work for my purpose) instead of %E6%96%B0 (which works well with xyz.com).

I'm using this website to create the latter encoded string.

I've also set header('Content-Type: text/html; charset=gb2312'); to display Chinese characters properly.

Is there anything I can do to urlencode the Chinese character properly?

Thanks!

PS: I'm a relatively new programmer and don't understand Chinese.

Upvotes: 4

Views: 7073

Answers (2)

Lars Knickrehm

Reputation: 747

In order to reproduce your problem I created a simple PHP file:

<?php
var_dump(urlencode('新'));
?>

First I saved the file in UTF-8 and got %E6%96%B0. Afterwards I changed the file's encoding to GB2312 and got %D0%C2.
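To make the difference visible without depending on how the file itself is saved, here is a minimal sketch that writes the character as explicit bytes. The variable names are only for illustration; the byte values are the ones quoted in this thread.

<?php
// The same character written as explicit bytes, so the result does not
// depend on the encoding this file is saved in.
$utf8   = "\xE6\x96\xB0"; // 新 in UTF-8 (three bytes)
$gb2312 = "\xD0\xC2";     // 新 in GB2312 (two bytes)

var_dump(urlencode($utf8));   // string(9) "%E6%96%B0"
var_dump(urlencode($gb2312)); // string(6) "%D0%C2"
?>

urlencode() just percent-encodes whatever bytes it is given, which is why the file's encoding decides the output.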

The tool at http://meyerweb.com/eric/tools/dencoder/ seems to use JavaScript, which encodes as UTF-8 and therefore returns %E6%96%B0, too.

PS: When changing from GB2312 to UTF-8, some editors might break internationalized code. So please make sure to have a copy of your file before converting!

Upvotes: 1

MajorSanchez

Reputation: 156

You're URL-encoding the string in the charset you specify in your header. urlencode() percent-encodes the bytes it is given: %D0%C2 is 新 in GB2312; %E6%96%B0 is 新 in UTF-8. Switch your charset over to UTF-8 and you should fix this issue and still be able to display Simplified Chinese Han characters.
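If switching the whole page to UTF-8 isn't practical, another option is to convert just the parameter before encoding it. This is only a sketch under a few assumptions: the mbstring extension is available, PHP is version 7 or later (for the type hints), the input really is GB2312, and urlencode_as_utf8 is a made-up helper name, not a built-in function.

<?php
// Hypothetical helper: convert a string to UTF-8, then percent-encode it.
// 'EUC-CN' is the name mbstring uses for the GB2312 byte encoding.
function urlencode_as_utf8(string $value, string $from = 'EUC-CN'): string
{
    return urlencode(mb_convert_encoding($value, 'UTF-8', $from));
}

$gb2312 = "\xD0\xC2"; // 新 as GB2312 bytes
echo 'http://xyz.com/?q=' . urlencode_as_utf8($gb2312), "\n";
// prints http://xyz.com/?q=%E6%96%B0
?>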

Upvotes: 4
