Furkanicus

Reputation: 339

Requests Post UnicodeEncodeError comes up despite encoding with utf-8

I have seen many topics opened on this subject, yet none of them helped me solve the issue. I have a dataset containing text with many different characters, so I encode the text before making a POST request with the Requests library on Python 2.7.13.

My code is the following:

# -*- coding: utf-8 -*-
# encoding=utf8
import sys
reload(sys)
sys.setdefaultencoding('utf8')
import json
import requests
text = """So happy to be together on your birthday! ❤ Thankful for real life. ❤ A post shared by Jessica Chastain (@jessicachastain) on Nov 13, 2016 at 5:22am PST"""
textX = json.dumps({'text': text.encode('utf-8')})
r = requests.post('http://####', data=textX,
                      headers={'Content-Type': 'application/json; charset=UTF-8'})
print(r.text)

The data is sent in JSON format. No matter where I try to encode the text as UTF-8, I'm still getting the following error from Requests.

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2764' in
position 42: Body ('❤') is not valid Latin-1. Use body.encode('utf-8')
if you want to send it encoded in UTF-8.

Edit: Syntax error fixed, but not the cause of the problem

Upvotes: 1

Views: 2789

Answers (1)

Mark Tolonen

Reputation: 177674

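By default, json.dumps generates an ASCII-only string (ensure_ascii=True), escaping each non-ASCII character as a \uXXXX sequence, which eliminates encoding problems. The mistake is that the text is not a Unicode string: declare it as a Unicode literal (u"""...""") and pass it to json.dumps directly, without calling .encode('utf-8') on it first. Also make sure to save the source file in the encoding declared (#coding=utf8):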

# coding=utf8
import json
text = u"""So happy to be together on your birthday! ❤ Thankful for real life. ❤ A post shared by Jessica Chastain (@jessicachastain) on Nov 13, 2016 at 5:22am PST"""
textX = json.dumps({u'text': text})

Output (the repr of textX):

'{"text": "So happy to be together on your birthday! \\u2764 Thankful for real life. \\u2764 A post shared by Jessica Chastain (@jessicachastain) on Nov 13, 2016 at 5:22am PST"}'
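The two ways of producing a safe request body can be sketched as follows. This is a minimal illustration rather than a tested fix for the asker's exact setup: build_json_body is a hypothetical helper name, the sample text is shortened, and the requests.post call is left as a comment because the real URL is elided in the question. The second option follows the hint in the error message itself (encode the body as UTF-8 before sending).

```python
# -*- coding: utf-8 -*-
import json

# Shortened sample text; \u2764 is the heart character from the question.
text = u"So happy to be together on your birthday! \u2764 Thankful for real life."


def build_json_body(value, ascii_only=True):
    """Hypothetical helper: serialize {'text': value} into a request body.

    ascii_only=True  -> rely on json.dumps' default ensure_ascii=True, so the
                        result contains only ASCII (non-ASCII becomes \\uXXXX).
    ascii_only=False -> keep the raw characters and encode the whole body
                        to UTF-8 bytes before it is handed to the HTTP client.
    """
    if ascii_only:
        return json.dumps({u'text': value})
    return json.dumps({u'text': value}, ensure_ascii=False).encode('utf-8')


ascii_body = build_json_body(text)                   # escaped, ASCII-only
utf8_body = build_json_body(text, ascii_only=False)  # UTF-8 encoded bytes

# Either body could then be sent with the same call as in the question, e.g.:
# requests.post('http://####', data=utf8_body,
#               headers={'Content-Type': 'application/json; charset=UTF-8'})
```

Both bodies decode to the same JSON document on the server side. The ASCII-only form is the simpler choice because it cannot trip any codec along the way; the UTF-8 form is slightly smaller on the wire when the text contains many non-ASCII characters.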

Upvotes: 1
