hoju

Reputation: 29452

how to open a URL with non utf-8 arguments

Using Python, I need to send non-UTF-8 data (specifically Shift-JIS) to a URL via the query string. How should I transfer the data? Quote it? Encode it in UTF-8?

Thanks

Upvotes: 0

Views: 1508

Answers (3)

bobince

Reputation: 536379

Query string parameters are byte-based. Whilst IRI-to-URI conversion and non-ASCII characters typed into a browser will typically use UTF-8, there is nothing forcing you to send or receive your own parameters in that encoding.

So for Shift-JIS (actually typically cp932, the Windows extension of that encoding):

import urllib  # Python 2: quote() lives directly in urllib
foo = u'\u65E5\u672C\u8A9E'  # 日本語
url = 'http://www.example.jp/something?foo=' + urllib.quote(foo.encode('cp932'))

In Python 3 you do it in the quote function itself:

import urllib.parse  # Python 3: quote() moved to urllib.parse
foo = '\u65E5\u672C\u8A9E'  # 日本語
url = 'http://www.example.jp/something?foo=' + urllib.parse.quote(foo, encoding='cp932')
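
As a rough check of what the Python 3 call produces, the sketch below quotes the same text as cp932 and then reverses the step. It assumes Python 3 and that the receiving side also knows to expect cp932; `urlencode` and `parse_qs` are standard `urllib.parse` helpers that take the same `encoding` argument, and the example.jp URL is only the placeholder used above.

```python
from urllib.parse import quote, unquote, urlencode, parse_qs

foo = '\u65E5\u672C\u8A9E'  # 日本語

# Percent-encode the cp932 bytes of the text (not its UTF-8 bytes).
quoted = quote(foo, encoding='cp932')

# urlencode takes the same encoding argument and builds the whole query string.
query = urlencode({'foo': foo}, encoding='cp932')
url = 'http://www.example.jp/something?' + query

# The receiver must be told the encoding; it cannot be detected from the URL.
decoded = unquote(quoted, encoding='cp932')
params = parse_qs(query, encoding='cp932')

print(url)
print(decoded == foo, params)
```

If the receiver decodes with UTF-8 instead, the bytes will not round-trip, so the encoding has to be agreed on out of band.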

Upvotes: 4

mkluwe

Reputation: 4061

By the »query string«, do you mean an HTTP GET request like http://{URL}?data=XYZ?

One option is to encode whatever data you have with base64.b64encode, using -_ as the alternative characters so that the result is URL-safe. See here.
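
A minimal sketch of the base64 approach, assuming Python 3 and that the server is under your control and decodes the value the same way. `base64.urlsafe_b64encode` is the standard-library shortcut for `b64encode` with `-_` as the alternative characters; the `data` parameter name and the example.jp URL are placeholders.

```python
import base64

foo = '\u65E5\u672C\u8A9E'  # 日本語
raw = foo.encode('cp932')   # the Shift-JIS (cp932) bytes to transfer

# URL-safe base64: '+' and '/' are replaced by '-' and '_'.
token = base64.urlsafe_b64encode(raw).decode('ascii')
url = 'http://www.example.jp/something?data=' + token

# Receiving side: reverse both steps to get the text back.
restored = base64.urlsafe_b64decode(token).decode('cp932')

print(url)
print(restored == foo)
```

Depending on the input length the token can end in `=` padding, which some servers prefer to see percent-encoded.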

Upvotes: 0

Tuure Laurinolli

Reputation: 4087

I don't know what Unicode has to do with this, since the query string is a string of bytes. You can use the quoting functions in urllib to quote plain byte strings so that they can be passed within query strings.
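
To illustrate the bytes-only view, here is a small sketch assuming Python 3, where the byte-string equivalent is a `bytes` object. `quote` accepts bytes directly and `unquote_to_bytes` reverses it without any text encoding being involved; the sample bytes are the cp932 form of 日本語 and the URL is a placeholder.

```python
from urllib.parse import quote, unquote_to_bytes

# Already-encoded data: the cp932 bytes for 日本語.
raw = b'\x93\xfa\x96\x7b\x8c\xea'

# quote() percent-encodes every byte that is not URL-safe; no encoding argument is needed.
quoted = quote(raw)
url = 'http://www.example.jp/something?foo=' + quoted

# The receiver gets the identical bytes back and decides how to interpret them.
received = unquote_to_bytes(quoted)

print(url)
print(received == raw)
```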

Upvotes: 1
