JStew

Reputation: 467

How can I write to a CSV without getting a Unicode Error?

I've seen many examples of this issue, but haven't yet found a straightforward solution that's worked for me. I still receive the error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xad' in position 5: ordinal not in range(128)

Here's the code I've put together based on similar questions raised on Stack Overflow.

f = open(out_filepath, 'w')
writer = csv.DictWriter(f, fieldnames, extrasaction='ignore')
headers = dict([(header, header) for header in fieldnames])
data = [headers]
data.extend([row for row in rows]) # add data rows
for row in data:
    try:
        writer.writerow(row)
    except:
        for value in row.itervalues():
            for s in value:
                try:
                    unicode(s).encode("utf-8")
                except:
                    s = ''
        writer.writerow(row)
f.close()

Here's the updated code I'm trying, which still gives me errors:

for row in data:
    try:
        writer.writerow(row)
    except:
        for key in row:
            value = row[key]
            letterlist = list(value)
            for i in range(len(letterlist)):
                try:
                    letterlist[i].decode('string_escape')
                    letterlist[i].encode('ascii', 'ignore')
                except:
                    print 'Letter excluded from ' +key+' '+ str(letterlist) 
                    letterlist[i] = ''
            value = ''.join(letterlist)
            row[key] = value
        #print row
        writer.writerow(row)

Upvotes: 2

Views: 2101

Answers (2)

JStew

Reputation: 467

I ended up using this function to encode unicode values as UTF-8 before writing them.

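def ValConvert(val):
  # Python 2: encode unicode text as UTF-8 bytes so the csv module can write it.
  if isinstance(val, unicode):
    return val.encode('utf8')
  # Byte strings are passed through unchanged.
  elif isinstance(val, str):
    return val
  # Anything else (numbers, None, ...) is converted with str().
  else:
    return str(val)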
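
For anyone wondering how a per-value conversion like this fits the row dictionaries from the question, here is a minimal sketch. The helper name `encode_row` and the sample row are made up for illustration, and it assumes each row is a flat dict, as `csv.DictWriter` expects. It builds a new dict instead of editing the row in place, and it leaves non-text values alone.

```python
# type(u'') is `unicode` on Python 2 and `str` on Python 3,
# so the same check works on either interpreter.
text_type = type(u'')


def encode_row(row):
    """Return a copy of `row` with text values encoded as UTF-8 bytes.

    `encode_row` is a hypothetical helper name, not part of the csv module.
    Non-text values (numbers, None, byte strings) are left untouched.
    """
    return dict(
        (key, value.encode('utf-8') if isinstance(value, text_type) else value)
        for key, value in row.items()
    )


# Sample row containing the soft hyphen (u'\xad') from the error message.
row = {'name': u'co\xadop', 'count': 3}
encoded = encode_row(row)
```

On Python 2 the write loop would then presumably become `writer.writerow(encode_row(row))`, with no `try`/`except` needed around it.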

Upvotes: 0

Ryan

Reputation: 3709

Here's what has worked for me:

f = open('eg.csv', 'w')
s = 'some troublesome string'
f.write(s.decode('string_escape')) 

If that doesn't work, I fall back to dropping the non-ASCII characters:

f.write(s.encode('ascii', 'ignore'))
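
One caveat about the `'ignore'` approach: it silently throws away every character that is not ASCII, so the written file no longer matches the source data. The small sketch below (the sample string is invented for illustration) compares it with encoding to UTF-8, which keeps everything.

```python
# Sample text with an accented letter and the soft hyphen (u'\xad')
# that triggered the error in the question.
text = u'caf\xe9 co\xadop'

# 'ignore' drops anything outside ASCII: both special characters vanish.
ascii_only = text.encode('ascii', 'ignore')

# UTF-8 can represent every character, so decoding restores the original.
utf8_bytes = text.encode('utf-8')
round_trip = utf8_bytes.decode('utf-8')
```

Here `ascii_only` ends up shorter than the original text, while `round_trip` is equal to `text`.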

Upvotes: 1
