Blue Moon

Reputation: 4671

pandas dataframe and u'\u2019'

I have a pandas DataFrame (Python 2.7) containing the character u'\u2019', which prevents me from exporting my result as CSV.

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 180: ordinal not in range(128)

Is there a way to query the DataFrame and substitute this character with another one?

Upvotes: 0

Views: 2442

Answers (2)

Blue Moon

Reputation: 4671

I did not manage to export the whole file. However, I managed to identify the rows containing the character causing the problem and eliminate them:

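faulty_rows = []
for i in range(len(outcome)):
    try:
        # Write each row on its own; a row with non-ASCII text raises here.
        outcome.iloc[i].to_csv("/Users/john/test/test.csv")
    except UnicodeEncodeError:
        faulty_rows.append(i)
        print(i)

# Drop the offending rows by position, then export what is left.
tocsv = outcome.drop(outcome.index[faulty_rows])

tocsv.to_csv("/Users/john/test/test.csv")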
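
The same rows can probably be found without writing a file per row. This is only a sketch under assumptions: the sample `outcome` frame and the helper `has_non_ascii` are made up for illustration, and it only inspects unicode text values.

```python
import pandas as pd

# Hypothetical sample data standing in for the real `outcome` frame.
outcome = pd.DataFrame({"text": [u"it\u2019s", u"ok"], "n": [1, 2]})

def has_non_ascii(value):
    # Hypothetical helper: True only for unicode text holding a
    # character outside the ASCII range (code point above 127).
    return isinstance(value, type(u"")) and any(ord(ch) > 127 for ch in value)

# Flag every row in which at least one cell contains such a character.
mask = outcome.apply(lambda row: any(has_non_ascii(v) for v in row), axis=1)
faulty_labels = outcome[mask].index.tolist()

# Keep only the rows that plain ASCII can represent.
tocsv = outcome[~mask]
```

Unlike the loop above, this yields index labels rather than positions, so `outcome.drop(faulty_labels)` would give the same `tocsv`.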

Upvotes: 0

jarandaf

Reputation: 4427

Try specifying a different encoding when saving to file. In Python 2.x, pandas defaults to ascii, which cannot represent non-ASCII characters such as u'\u2019', hence the error:

df.to_csv(path, encoding='utf-8')
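
If you would rather substitute the character, as the question asks, something along these lines should work. It is a minimal sketch: the sample frame and its column names are invented for illustration, and the ASCII apostrophe is just one possible replacement.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up.
df = pd.DataFrame({"text": [u"it\u2019s fine", u"plain"], "n": [1, 2]})

# With regex=True the pattern is matched inside each string cell, so the
# right single quotation mark is swapped for an ASCII apostrophe wherever
# it occurs. Non-string cells (the numbers here) are left as they are.
df = df.replace(u"\u2019", u"'", regex=True)
```

After this, the frame holds only ASCII text for that character, although any other non-ASCII characters would still need `encoding='utf-8'` or their own replacement.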

Upvotes: 1
