Milutin

Reputation: 33

Twitter API Python Character Encoding

I am experimenting with the Twitter API for Python and have run into a character encoding/decoding issue. When I collect tweets for a user (@BBCWorld in this instance) and a tweet contains special punctuation, I receive the following error:

286952044814794753 :  Traceback (most recent call last):
  File "C:\Python27\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u201c' in position 0: character maps to <undefined>

Note: The long number at the start is the ID of the tweet causing the error.

The specific character causing this problem is a curly (opening) double quotation mark, U+201C, like those used in MS Word. Is there a way to display such punctuation in a compatible form? Ideally I want to sanitise tweets to overcome this kind of error by replacement, thereby maintaining context, rather than omitting characters.

This is the core of the code:

tweets=api.GetUserTimeline('BBCWorld') 
try: 
    for tweet in tweets: 
        print tweet.id, ": ", (tweet.text) 
except UnicodeEncodeError as uee: 
    print uee

Thanks for any pointers,

Milutin

Upvotes: 3

Views: 2442

Answers (1)

t.dubrownik

Reputation: 13432

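This does not seem to be an issue with python-twitter, or with Python for that matter. It is a problem with the Windows cmd console: its cp850 code page has no mapping for the character U+201C, so printing the tweet text fails when Python tries to encode it for output.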
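
To see that the failure comes from the target encoding rather than from the Twitter library, here is a minimal sketch that reproduces it without any network call. The sample string is just the start of the tweet quoted in this thread, and the variable names are mine, not part of python-twitter:

```python
# Reproduce the error from the traceback without calling the Twitter API.
# The text starts with U+201C (left double quotation mark).
text = u"\u201cHow do you change mindsets\u201d"

failed = False
try:
    # cp850 is the code page named in the traceback; it has no slot for U+201C.
    text.encode("cp850")
except UnicodeEncodeError as exc:
    failed = True
    print(exc)

# UTF-8 can represent every Unicode character, so this always succeeds.
encoded = text.encode("utf-8")
```

The same string encodes fine as UTF-8, which is why a terminal using UTF-8 shows the tweet correctly.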

If you try this under a suitable Unix terminal, this is what you get:

>>> import twitter
>>> api = twitter.Api()
>>> print api.GetStatus('286952044814794753').text
“How do you change mindsets at a societal level, in a country of 1.2bn people?” - Viewpoints from India http://t.co/RiP4t71q #Delhigangrape

Take a look at this question for a discussion of how to deal with this under Windows: Unicode not printing correctly to cp850 (cp437), play card suits

My best bet for you would be to change your console font and code page to Unicode-compliant ones, as outlined here: https://stackoverflow.com/a/4234515/679897 or here: http://www.velocityreviews.com/forums/t717717-python-unicode-and-windows-cmd-exe.html
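
If changing the console is not an option and you would rather sanitise by replacement, as the question asks, something along these lines might work. This is only a sketch: `sanitise` and the `REPLACEMENTS` table are hypothetical names of my own, the table covers just a few common typographic characters, and cp850 is assumed as the target because it is the code page in the traceback.

```python
# Hypothetical mapping from typographic punctuation to ASCII look-alikes.
# Keys are Unicode code points; extend the table as needed.
REPLACEMENTS = {
    0x2018: u"'",    # left single quotation mark
    0x2019: u"'",    # right single quotation mark
    0x201C: u'"',    # left double quotation mark
    0x201D: u'"',    # right double quotation mark
    0x2013: u"-",    # en dash
    0x2014: u"-",    # em dash
    0x2026: u"...",  # horizontal ellipsis
}


def sanitise(text, encoding="cp850"):
    """Return text that is safe to print on a console using `encoding`."""
    # First swap known punctuation for ASCII equivalents to keep context.
    text = text.translate(REPLACEMENTS)
    # Anything still unencodable becomes '?' instead of raising an error.
    return text.encode(encoding, "replace").decode(encoding)


print(sanitise(u"\u201cHow do you change mindsets\u201d"))
```

In the loop from the question you would then print `sanitise(tweet.text)` instead of `tweet.text`. Characters that cp850 does support, such as accented Latin letters, pass through unchanged.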

Upvotes: 3
