Connor Leech

Reputation: 18834

grab tweet with python encounters UnicodeEncodeError

I am trying to grab the text of tweets using the Twitter API and Python.

I use oauth to log in and get the resulting dictionary with:

import json

jsonTweets = json.loads(response)  # response holds the JSON text from the API
list = jsonTweets["statuses"]      # list of dictionaries

type(jsonTweets)  # returns dict
type(list)        # returns list
type(list[0])     # returns dict (it's a list of dictionaries)

list[0] is a dictionary:

{u'contributors': None, u'truncated': False, u'text': u'RT @Kagame_quotes: "We, the people of #Rwanda, our country has its own problems that we can\u2019t attribute to others, we need to find solution\u2026', u'in_reply_to_status_id': None, u'id': 387905246028394496L, u'favorite_count': 0, u'source': u'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', u'retweeted': False, u'coordinates': None, etc...

I only want the value for the u'text' key (i.e. the tweet itself), so I write:

for item in list:
    print item[u'text']

But that gives me the error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019'
in position 91: ordinal not in range(128)

How can I grab the value for the u'text' key?

Upvotes: 0

Views: 762

Answers (2)

Giacomo d'Antonio

Reputation: 2275

There's nothing wrong with your text. It just contains Unicode characters that can't be encoded with the ASCII codec Python is using for your console output.

In particular (see the table at http://www.utf8-chartable.de/unicode-utf8-table.pl):

  • U+2019 RIGHT SINGLE QUOTATION MARK
  • U+2026 HORIZONTAL ELLIPSIS

Upvotes: 0

Ben

Reputation: 4331

You need to specify UTF-8 encoding:

for item in list:
    print item[u'text'].encode('utf-8')

That should do the trick.
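As a rough sketch of the same idea, the encoding step can be pulled into a small helper. The names `tweets` and `encode_texts` and the sample data are assumptions for illustration, not part of the Twitter API. The second call shows a fallback for terminals that really are ASCII-only, where unsupported characters become `?` instead of raising:

```python
# Hypothetical sample data shaped like the "statuses" list in the question.
tweets = [{u"text": u"we can\u2019t attribute to others\u2026"}]

def encode_texts(items, encoding="utf-8", errors="strict"):
    """Encode each item's u'text' value to bytes with the given codec."""
    return [item[u"text"].encode(encoding, errors) for item in items]

# UTF-8 can represent every Unicode character, so this never raises.
utf8_lines = encode_texts(tweets)

# Fallback for ASCII-only output: unencodable characters are replaced with '?'.
ascii_lines = encode_texts(tweets, "ascii", "replace")

for line in utf8_lines:
    print(line)
```

In Python 2, which the question uses, printing the encoded byte strings writes the raw bytes to the terminal, so the UTF-8 version displays correctly only if the terminal itself expects UTF-8.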

Upvotes: 1
