stretch

Reputation: 231

Python: saving non-ASCII characters to a file

I'm trying to make a function which prints to the command prompt and to a file. I get encoding/decoding errors with the following code:

import os

def pas(stringToProcess): #printAndSave
  print stringToProcess 
  try: f = open('file', 'a')
  except: f = open('file', 'wb')
  print  >> f, stringToProcess
  f.close()

all = {u'title': u'Pi\xf1ata', u'albumname': u'New Clear War {EP}', u'artistname': u'Montgomery'}

pas(all['title'])

I get the following output:

Piñata
Traceback (most recent call last):
  File "new.py", line 17, in <module>
    pas(all['title'])
  File "new.py", line 11, in pas
    print  >> f, stringToProcess
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 2: ordinal not in range(128)

I've tried all the encode()/decode() permutations I can imagine from similar answers on here, without success. How can this error be solved?

Upvotes: 3

Views: 1905

Answers (3)

Bestasttung

Reputation: 2458

I've just tried this and it works; I had read an interesting related question.

Encoding is always a bit tricky:

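def pas(stringToProcess):  # printAndSave
    # Encode the unicode string to UTF-8 bytes explicitly before writing.
    strtp = stringToProcess.encode('utf-8')
    print stringToProcess
    # Mode 'a' creates the file if it does not exist, so no fallback open is needed.
    f = open('file.txt', 'a')
    f.write(strtp + '\n')  # add the newline that print >> f would have written
    f.close()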

all = {u'title': u'Pi\xf1ata', u'albumname': u'New Clear War {EP}', u'artistname': u'Montgomery'}

pas(all['title'])
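
To see why the explicit encode step matters, the failure can be reproduced without any file at all. This is a minimal sketch, assuming the file write falls back to the ASCII codec (which is what the traceback in the question reports). The helper name `try_encode` is made up for illustration.

```python
title = u'Pi\xf1ata'

def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent the text.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None

# ASCII has no representation for u'\xf1', so this attempt fails.
ascii_bytes = try_encode(title, 'ascii')
# UTF-8 can represent every Unicode code point, so this attempt succeeds.
utf8_bytes = try_encode(title, 'utf-8')
```

Here `ascii_bytes` ends up as `None`, while `utf8_bytes` holds the bytes that are safe to pass to `f.write`.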

Upvotes: 1

Bhargav Rao

Reputation: 52151

Use sys.setdefaultencoding('utf8') to prevent the error from occurring.

That is:

import os, sys
reload(sys)  # site.py removes setdefaultencoding at startup; reload restores it
sys.setdefaultencoding('utf8')
def pas(stringToProcess): #printAndSave
  print stringToProcess 
  try: f = open('file', 'a')
  except: f = open('file', 'wb')
  print  >> f, stringToProcess
  f.close()

all = {u'title': u'Pi\xf1ata', u'albumname': u'New Clear War {EP}', u'artistname': u'Montgomery'}

pas(all['title'])

This would print

Piñata

Upvotes: 3

csl

Reputation: 11368

As someone commented, you probably just need to specify which codec to use when writing the string. E.g., this works for me:

def pas(s):
    print(s)
    with open("file", "at") as f:
        f.write("%s\n" % s.encode("utf-8"))

pas(u'Pi\xf1ata')
pas(u'Pi\xf1ata')

As you can see, I specifically open the file in append/text mode. If the file doesn't exist, it will be created. I also use with instead of your try-except method. This is merely the style I prefer.

As Bhargav says, you can also set the default encoding. It all depends on how much control you need in your program and both ways are fine.
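
Another way to specify the codec is to let the file object do the encoding, so the string never has to be encoded by hand. The following is only a sketch, assuming `io.open` (available since Python 2.6, and the same function as the built-in `open` in Python 3). The function name `save_line` and the file name `pinata_demo.txt` are made up for illustration, and the sketch covers only the file part, not the printing.

```python
import io
import os
import tempfile

def save_line(text, path):
    """Append text plus a newline to path, encoded as UTF-8 (hypothetical helper)."""
    # io.open encodes unicode text on write; mode 'a' creates the file if missing.
    with io.open(path, 'a', encoding='utf-8') as f:
        f.write(text + u'\n')

# Illustrative path in the temp directory; any writable path would do.
demo_path = os.path.join(tempfile.gettempdir(), 'pinata_demo.txt')
if os.path.exists(demo_path):
    os.remove(demo_path)  # start from an empty file so the result is predictable

save_line(u'Pi\xf1ata', demo_path)
save_line(u'Pi\xf1ata', demo_path)

# Read the file back with the same codec to check what was stored.
with io.open(demo_path, encoding='utf-8') as f:
    lines = f.read().splitlines()
```

With this approach the codec is stated once, where the file is opened, and reading the file back with the same `encoding` argument returns the original unicode strings.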

Upvotes: 3
