Reputation: 77
I'm using Flask to build a web server that handles Chinese-language requests sent via POST. Originally I planned to use request.form['body'] to get the content. However, the client-side encoding is BIG5, and the values returned by Flask's request.form are always decoded as UTF-8, so I have to use request.get_data() to retrieve the raw data from the request and decode it myself.
The weird thing is that when enctype is multipart/form-data, everything is fine: I can use request.get_data().decode('big5') to get the correct characters. But when I don't specify an enctype, so that application/x-www-form-urlencoded is used by default, the returned value looks like this:
Result 1.
%B6W%C3%D9%A4u%B5%7B%A6%B3%AD%AD%A4%BD%A5q
which is not BIG5-encoded. The original text should look like this:
Result 2.
超贊工程有限公司
The BIG5-encoded bytes should look like this:
Result 3.
\xb6W\xc3\xd9\xa4u\xb5{\xa6\xb3\xad\xad\xa4\xbd\xa5q
My question is: how can I properly decode the form data from Result 1 to Result 2 when using application/x-www-form-urlencoded?
Code and result when the content type is application/x-www-form-urlencoded:
Code and result when the content type is multipart/form-data:
Upvotes: 1
Views: 1710
Reputation: 1186
You're getting a URL-encoded (percent-encoded) string. Use urllib.parse to decode it:
import urllib.parse  # a plain "import urllib" does not reliably load the parse submodule
data = '%B6W%C3%D9%A4u%B5%7B%A6%B3%AD%AD%A4%BD%A5q'
print(urllib.parse.unquote(data, encoding='big5'))
This prints 超贊工程有限公司, which looks like your expected output.
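If the raw body contains whole form fields rather than a single value, the same idea can be applied with urllib.parse.parse_qs, which accepts an encoding argument too. The following is only a minimal sketch: the helper name decode_big5_form is made up for illustration, the field name body is taken from the question, and it assumes the client percent-encodes every non-ASCII byte (as browsers normally do), so the raw body is pure ASCII.

```python
from urllib.parse import parse_qs


def decode_big5_form(raw_body: bytes) -> dict:
    """Parse an application/x-www-form-urlencoded body whose values are BIG5.

    Hypothetical helper, not part of Flask or the standard library.
    """
    # Assumption: the body is fully percent-encoded, so it is plain ASCII.
    text = raw_body.decode("ascii")
    # parse_qs percent-decodes each value and interprets the bytes as BIG5.
    parsed = parse_qs(text, keep_blank_values=True, encoding="big5")
    # Keep only the first value for each field name.
    return {name: values[0] for name, values in parsed.items()}


# Stand-in for the bytes that request.get_data() would return in Flask.
raw = b"body=%B6W%C3%D9%A4u%B5%7B%A6%B3%AD%AD%A4%BD%A5q"
fields = decode_big5_form(raw)
print(fields["body"])
```

In a Flask view, the raw argument would presumably come from request.get_data(), as in the question. Fields that repeat are reduced to their first value here, so adjust that if your form sends multiple values per name.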
Upvotes: 2