Handle UnicodeEncodeError in Python 2.7

Question

I have the following code:

    for index, row in df_out.iterrows():
        yield {
               'CustomerName': str(row['CustomerName'])
              }

and I get a UnicodeEncodeError:

    RuntimeError: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 10: ordinal not in range(128)

How can I handle this part in order to avoid that error?

    str(row['CustomerName'])

snakecharmerb · Accepted Answer

If you are potentially dealing with non-ASCII text in Python 2, then calling

    str(some_text)

is usually a bad idea: when some_text is a unicode object, str() implicitly encodes it with the default codec (ASCII unless it has been changed), so you will get a UnicodeEncodeError if it contains non-ASCII characters. The correct code would be

    unicode(some_text)

as unicode() will not try to encode your text as ASCII.
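To see the failure in isolation, here is a minimal sketch that does not need pandas. The sample name is made up, and the explicit .encode('ascii') call stands in for what str() does implicitly on Python 2, so the snippet behaves the same on Python 2 and Python 3:

```python
# -*- coding: utf-8 -*-
# Sketch only: the sample value is invented for illustration.
some_text = u'Andr\xe9 Dupont'

# On Python 2, str(some_text) implicitly performs the equivalent of this
# call, using the default codec (ASCII unless it has been changed).
try:
    some_text.encode('ascii')
    failed = False
except UnicodeEncodeError:
    failed = True

# Encoding explicitly with a codec that covers the character succeeds.
utf8_bytes = some_text.encode('utf-8')
```

If you really do need a byte string at the end (for example to write to a file), encoding explicitly like this is usually safer than relying on str().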

However, given this code

    for index, row in df_out.iterrows():
        yield {
               'CustomerName': str(row['CustomerName'])
              }

it's very likely that row['CustomerName'] is already a unicode object, so calling unicode on it would be redundant. This will probably work:

    for index, row in df_out.iterrows():
        yield {
               'CustomerName': row['CustomerName']
              }

To summarise: remove the str call. If that doesn't work, try replacing str with unicode.
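If the column might hold a mix of unicode objects, byte strings and other values such as numbers, a small helper can normalise them all to text. This is only a sketch: the name to_text and the UTF-8 default are my assumptions, not something taken from the question, so adjust the encoding to whatever your data actually uses.

```python
# -*- coding: utf-8 -*-
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3, where str is already unicode text


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return value as text without going through ASCII."""
    if isinstance(value, text_type):
        # Already text, nothing to do.
        return value
    if isinstance(value, bytes):
        # Byte string (plain str on Python 2): decode with an explicit codec.
        return value.decode(encoding)
    # Numbers and other objects: convert straight to text.
    return text_type(value)
```

In the generator you would then write 'CustomerName': to_text(row['CustomerName']) instead of calling str.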
