anekix

Reputation: 2563

Understanding a Python data encoding issue

I am getting some data from Facebook using requests. This is the sample data:

response = {'message': 'I have recommended your  name to all my family n friend
s. Thankyou!!!!\\ud83d\\ude0a\\ud83d\\ude0a\\ud83e\\udd17\\ud83e\\udd17\\ud83d\\udc4c\\ud83d\\udc4c\\ud83d\\udc4d\\ud83d\\udc4
}

The last few characters are emojis, and I need to save the message in a database.

So I try to convert it into a dictionary first, so that I can add keys and manipulate the data:

response = json.loads(response.content, encoding='utf-8')

But when I do print(response), I get something like:

       {
'message': 'I have recommended your  name to all my family n friend
        s. Thankyou!!!!__ __ __ __ __ __ __
        }

And from the database I get this error:

Incorrect string value: '\xF0\x9F\x98\x8A\xF0\x9F...'

What encoding did I get? How do I transform it so that I can store it in the database (MySQL)?

Upvotes: 0

Views: 363

Answers (2)

Raman Saini

Reputation: 435

You can either use the unicodedata module to strip accents and drop whatever is left that is not ASCII:

>>> import unicodedata
>>> title = u"Klüft skräms inför på fédéral électoral große"
>>> unicodedata.normalize('NFKD', title).encode('ascii', 'ignore')
'Kluft skrams infor pa federal electoral groe'

Or just drop the non-ASCII characters, or replace them with a placeholder character of your own that you can treat as an emoji marker later:

>>> a=u"aaaàçççñññ"
>>> type(a)
<type 'unicode'>
>>> a.encode('ascii','ignore')
'aaa'
>>> a.encode('ascii','replace')
'aaa???????'
>>>
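
Note that 'ignore' and 'replace' both lose the original characters. If the emojis need to come back later, one reversible option in Python 3 is the built-in unicode_escape codec. This is only a minimal sketch; the helper names to_ascii_safe and from_ascii_safe are made up for illustration:

```python
def to_ascii_safe(text):
    """Return an ASCII-only str; non-ASCII characters become backslash escapes."""
    return text.encode('unicode_escape').decode('ascii')


def from_ascii_safe(escaped):
    """Reverse to_ascii_safe and restore the original characters."""
    return escaped.encode('ascii').decode('unicode_escape')


original = 'Thankyou!!!!\U0001F60A\U0001F44D'
stored = to_ascii_safe(original)      # safe to put in an ASCII-only column
restored = from_ascii_safe(stored)    # the emojis are back

print(stored)
print(restored == original)
```

The escaped form is longer than the original and is not readable by other tools as emojis, so this only makes sense when the column's character set cannot be changed.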

Or first encode it into a particular byte representation that can be stored. There are several common Unicode encodings, such as UTF-16 (two bytes for most characters, four for the rest) or UTF-8 (one to four bytes per code point, depending on the character). To convert the string into a particular encoding, you can use:

>>> s = u'£10'
>>> s.encode('utf8')
'\xc2\xa310'
>>> s.encode('utf16')
'\xff\xfe\xa3\x001\x000\x00'
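
For the emojis in the question specifically, here is a short Python 3 sketch of what the escapes and the error bytes are. The raw value below is a shortened, hypothetical stand-in for response.content. Each \ud83d\ude0a in the JSON is a surrogate pair that stands for one character, U+1F60A, and its UTF-8 form is the four bytes \xF0\x9F\x98\x8A shown in the MySQL error:

```python
import json

# Hypothetical, shortened stand-in for response.content (raw bytes from requests).
raw = b'{"message": "Thankyou!!!!\\ud83d\\ude0a"}'

# Decode the bytes, then parse; json joins the surrogate pair into one character.
data = json.loads(raw.decode('utf-8'))
emoji = data['message'][-1]

print(hex(ord(emoji)))             # code point of the last character
print(emoji.encode('utf-8'))       # the bytes MySQL rejected
print(len(emoji.encode('utf-8')))  # number of bytes UTF-8 needs for it
```

So the data is ordinary Unicode text and json.loads already handles it. MySQL's utf8 character set stores at most three bytes per character, which is the usual reason for "Incorrect string value" on four-byte characters like these. Assuming that is the cause here, the table column and the connection would both need to use utf8mb4 instead.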

Upvotes: 1

sancelot
sancelot

Reputation: 2053

This is Unicode. You have to decode the bytes when reading them in, and encode the string when storing or printing it.
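
A rough Python 3 sketch of that convention, assuming the data arrives as UTF-8 bytes (the incoming value is made up for illustration): decode once on the way in, work with str inside the program, and encode once on the way out.

```python
# Bytes as they might arrive from the network (UTF-8 encoded, illustrative value).
incoming = b'Thankyou \xf0\x9f\x98\x8a'

# Decode once, on input: bytes -> str.
text = incoming.decode('utf-8')

# Work with str inside the program.
text = text.upper()

# Encode once, on output: str -> bytes.
outgoing = text.encode('utf-8')

print(len(text))      # characters in the str
print(len(outgoing))  # bytes after encoding
```

The emoji counts as one character in the str but takes four bytes once encoded, which is why the storage layer has to support four-byte characters.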

Upvotes: 0
