user2920520

Reputation: 21

ASCII error when writing to a file opened with the codecs library

Code

# -*- coding: ISO-8859-15 -*-
import sys
import codecs
filename2 = "log_unicode2.log"
log_file2 = codecs.open(filename2, "w", "utf-8")
sys.stdout = log_file2
log_file2.write('aééé')

Error

Traceback (most recent call last):
  File "snippet_problem_unicode.py", line 7, in <module>
    log_file2.write('a├®├®├®')
  File "C:\Users\dev1\Envs\atao\lib\codecs.py", line 691, in write
    return self.writer.write(data)
  File "C:\Users\dev1\Envs\atao\lib\codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

Context

'aééé' is a byte string (latin-1) which needs to be converted to UTF-8. Why does this conversion involve the ascii codec?

Upvotes: 1

Views: 516

Answers (1)

Martijn Pieters

Reputation: 1123510

You are writing a byte string to a file object that expects a unicode value. To get from the byte string to a unicode value, Python first has to decode the byte string. This implicit decoding uses the default ASCII codec, which fails on any byte outside the range 0-127.
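The implicit step can be made explicit with a minimal sketch. It assumes the bytes in the source file are the UTF-8 encoding of 'aééé', which is what the 0xc3 byte at position 1 in the traceback suggests; the variable names `raw` and `error_message` are only illustrative.

```python
# -*- coding: utf-8 -*-
# Explicit version of the decode step that happens implicitly when a
# byte string is written to a codecs writer.
# Assumption: these are the bytes of 'aééé' as saved in UTF-8.
raw = b'a\xc3\xa9\xc3\xa9\xc3\xa9'

try:
    # The ASCII codec only accepts bytes 0-127, so 0xc3 is rejected.
    raw.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

The message printed here should name the same codec, byte, and position as the traceback in the question.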

Either:

  1. Use a unicode literal instead of a byte string:

    log_file2.write(u'aééé')
    
  2. Explicitly decode the bytestring to Unicode first, using your source file encoding:

    log_file2.write('aééé'.decode('latin1'))
    
  3. Don't use codecs.open(); open the file with the built-in open() function instead, then manually decode and encode to UTF-8:

    log_file2 = open(filename2, "w")
    log_file2.write('aééé'.decode('latin1').encode('utf8'))
    

    or use a unicode literal and encode manually:

    log_file2 = open(filename2, "w")
    log_file2.write(u'aééé'.encode('utf8'))
    
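As a rough check that options 1 and 3 produce the same file contents, here is a small sketch. The temporary directory, the file names `log_codecs.log` and `log_manual.log`, and the escaped literal are assumptions made for illustration; the escapes stand in for u'aééé' so the result does not depend on how the script itself is saved.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Same text as u'aééé', written with escapes (illustrative choice).
text = u'a\xe9\xe9\xe9'

# Hypothetical scratch files, not the log file from the question.
tmp_dir = tempfile.mkdtemp()
path_codecs = os.path.join(tmp_dir, "log_codecs.log")
path_manual = os.path.join(tmp_dir, "log_manual.log")

# Option 1: codecs.open() encodes unicode text to UTF-8 on write.
log_file = codecs.open(path_codecs, "w", "utf-8")
log_file.write(text)
log_file.close()

# Option 3: plain binary file, encode to UTF-8 manually before writing.
log_file = open(path_manual, "wb")
log_file.write(text.encode("utf-8"))
log_file.close()

# Read both files back as raw bytes to compare what was stored.
with open(path_codecs, "rb") as f:
    bytes_codecs = f.read()
with open(path_manual, "rb") as f:
    bytes_manual = f.read()

print(bytes_codecs == bytes_manual)
```

Either way the encoding happens exactly once, from a unicode value to UTF-8 bytes, so no implicit ASCII decode is triggered.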

Upvotes: 1
