Converting Special Characters in UTF-8 file to cp1252 file in Python

Question

We changed to a hosted web-based system which produces UTF-8 encoded files; however, we have legacy applications that require ANSI cp1252 encoded files. Converting is not a problem, but special French characters in names get munged in the translation into 2 bytes. That is not surprising, but users are insisting that the French characters be retained.

I wrote a Python program to translate the file as follows:

import io

src_path='0189enr.asc'
dst_path='a-new-file2.txt'
outcontent=""
changes=0

with io.open(src_path, mode="r", encoding="utf8") as fd:
    content = fd.read()

for char in content:
    if char == 'è':
        outchar=138
        outcontent=outcontent+chr(outchar)
    if char == 'ô':
        outchar=147
        outcontent=outcontent+chr(outchar)
    if char == 'é':
        outchar=130
        outcontent=outcontent+chr(outchar)
    else:
        outcontent=outcontent+char

with io.open(dst_path, mode="w", encoding="cp1252") as fd:
    fd.write(outcontent)

However the program is failing on the fd.write() with the error:

UnicodeEncodeError: 'charmap' codec can't encode character '\x82' in position 300132: character maps to <undefined>

I'm stuck - how can I modify these specific characters and produce a cp1252 encoded file? Any help would be appreciated, thanks!

Answers (1)
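
A possible explanation, offered as a sketch rather than a definitive fix: the values 130, 138 and 147 are the byte codes for é, è and ô in the old DOS code pages (cp437/cp850), not in cp1252. In Python 3, chr(130) is the control character U+0082, which cp1252 has no mapping for, hence the UnicodeEncodeError. cp1252 already contains é (0xE9), è (0xE8) and ô (0xF4), so no manual substitution should be needed: decode as UTF-8, then let the cp1252 codec do the encoding. Note also that the if / if / if / else chain in the question appends è and ô twice, because the else belongs only to the last if. The function names below are hypothetical, and errors="replace" (write "?" for anything cp1252 cannot represent) is an assumption about what the legacy applications can tolerate.

```python
import io


def encode_cp1252(text):
    """Encode text as cp1252 bytes (hypothetical helper for illustration).

    Assumption: characters with no cp1252 equivalent become "?" instead
    of raising UnicodeEncodeError.
    """
    return text.encode("cp1252", errors="replace")


def convert_utf8_to_cp1252(src_path, dst_path):
    """Re-encode a UTF-8 text file as cp1252 (hypothetical helper)."""
    # newline="" keeps the original line endings untouched.
    with io.open(src_path, mode="r", encoding="utf8", newline="") as src:
        content = src.read()

    # The codec maps each character to its cp1252 byte, so no
    # per-character substitution is required.
    with io.open(dst_path, mode="w", encoding="cp1252",
                 errors="replace", newline="") as dst:
        dst.write(content)


# Usage with the paths from the question:
# convert_utf8_to_cp1252('0189enr.asc', 'a-new-file2.txt')
```

If a silent "?" is not acceptable, dropping errors="replace" makes the write fail loudly on the first character that cp1252 cannot represent, which at least shows which characters need a decision.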
