jchung

Reputation: 953

Python UnicodeEncodeError for u'\u2019' while trying to create a CSV or export

I'm trying to export some data from a database to CSV, and I'm struggling to understand the following UnicodeEncodeError:

>>> sample
u'I\u2019m now'
>>> type(sample)
<type 'unicode'>
>>> str(sample)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 1: ordinal not in range(128)
>>> print sample
I’m now
>>> sample.encode('utf-8', 'ignore')
'I\xe2\x80\x99m now'

I'm confused. Is it unicode or not? What does the UnicodeEncodeError actually mean in this context? Why does print work just fine? If I want to be able to save this data to a CSV file, how can I handle the encoding so that it does not generate an error when I try to use csv.writer's writerow?

Thanks for your help.

Upvotes: 0

Views: 322

Answers (1)

Ulrich Eckhardt

Reputation: 17415

  1. It is a Python unicode object, as you verified with type(sample). Because it contains Unicode text, you can serialize it to a file using any of the Unicode encodings, such as UTF-8.

  2. The error message needs to be read carefully: it is the "ascii" codec that can't represent that string. In Python 2, str(sample) implicitly encodes the unicode object with the default ASCII codec. ASCII is just the Unicode subset with codepoints below 128 (0x00 to 0x7F). Your string contains codepoint 0x2019, so it can't be encoded as ASCII.

  3. print works because it doesn't go through the ASCII codec: it encodes the string with the encoding of stdout (see sys.stdout.encoding), which on your system can evidently represent U+2019 (most likely it is UTF-8). I think you would get a similar error if stdout were set up with e.g. Latin-1 as its encoding, since Latin-1 has no such character.

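  4. In order to write a CSV file, you could just use UTF-8 as the encoding for that file. I haven't used the csv module much myself, but as far as I know the Python 2 version works on byte strings, so you would encode each unicode value (e.g. value.encode('utf-8')) before passing the row to writerow. In any case, if it doesn't work, you should post the exact code that fails as an MCVE in a separate question.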
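
Points 2 and 3 can be checked directly with a small sketch. The helper name try_encode is made up for illustration, and the code is written so that it should run unchanged on Python 2 and Python 3:

```python
# The string from the question: U+2019 is a right single quotation mark.
sample = u'I\u2019m now'


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text.

    Hypothetical helper, only here to make the comparison compact.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ASCII covers only codepoints 0..127, so U+2019 cannot be encoded.
ascii_result = try_encode(sample, 'ascii')

# Latin-1 covers codepoints 0..255, which still excludes U+2019.
latin1_result = try_encode(sample, 'latin-1')

# UTF-8 can encode every Unicode codepoint; U+2019 becomes three bytes.
utf8_result = try_encode(sample, 'utf-8')
```

Here ascii_result and latin1_result end up as None, while utf8_result holds the same bytes that sample.encode('utf-8') produced in the question.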

BTW: Please upgrade to Python 3! It has many improvements over the 2.x series, also concerning string/Unicode handling.
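
For comparison, here is a rough sketch of the same export on Python 3, where the csv module accepts text directly and the encoding is chosen when the file is opened. The rows and the temporary file are placeholders, not your actual data:

```python
import csv
import os
import tempfile

# Placeholder rows standing in for the data pulled from the database.
rows = [['id', 'comment'], [1, 'I\u2019m now']]

# Use a throwaway file so the sketch cleans up after itself.
fd, path = tempfile.mkstemp(suffix='.csv')
os.close(fd)
try:
    # newline='' lets the csv module control line endings itself;
    # encoding='utf-8' makes the file able to hold any Unicode text.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)

    # Read the raw bytes back to see what actually landed on disk.
    with open(path, 'rb') as f:
        raw = f.read()
finally:
    os.remove(path)
```

No manual encode() calls are needed here; the apostrophe is written as the UTF-8 byte sequence \xe2\x80\x99 by the file object.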

Upvotes: 3
