Change string in file in python

Question

I have a problem changing a file in Python: I need to replace a string in it. The file is not a plain text file, but it can be edited with a text editor.

Here is my code:

with open(esp, "r") as f:
    content = f.readlines()
with open(esp_carsat, "w") as f:
    for line in content:
        f.write(line.replace("201", "202"))

The problem, I think, is that the content is in bytes: '\xff\xfe<\x00I\x00n\x00s\x00t\x00a\x00n\x00c\x00e\x00N\x00a\x00m\x00e\x00s\x00>\x00\r\x00\n'

So my replace is not working. I tried playing with the encoding, but the file was not readable afterwards. Furthermore, I have accents in the file (é, è, ...).

Is there a way to do what I want?

Martijn Pieters · Accepted Answer

You have UTF-16 encoded data. Decode to Unicode text, replace, and then encode back to UTF-16 again:

>>> data = '\xff\xfe<\x00I\x00n\x00s\x00t\x00a\x00n\x00c\x00e\x00N\x00a\x00m\x00e\x00s\x00>\x00\r\x00\n\x00'
>>> data.decode('utf16')
u'<InstanceNames>\r\n'

I had to append an extra \x00 to decode that; because the file was read without decoding, Python split the line on the \n byte and left the trailing \x00 for the next line.
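
The split described above can be reproduced with a short sketch. It is written for Python 3, where bytes and text are separate types (the session above is Python 2); the sample bytes and variable names are illustrative and not taken from the asker's file.

```python
# Sample UTF-16 data (BOM + little-endian), standing in for one line of the file.
raw = b"\xff\xfe" + "<InstanceNames>\r\n".encode("utf-16-le")

# Splitting on the single byte b"\n" cuts the two-byte newline in half:
# the first piece ends on "\n" and has lost the newline's trailing "\x00".
first_piece, leftover = raw.split(b"\n", 1)
first_piece += b"\n"

# An odd number of bytes cannot be valid UTF-16.
odd_length = len(first_piece) % 2 == 1

try:
    first_piece.decode("utf-16")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Re-attaching the stray "\x00" makes the line decodable again.
repaired = (first_piece + b"\x00").decode("utf-16")
```

This is why a byte-oriented readlines() on a UTF-16 file produces lines that do not decode on their own: every line after the first starts with the \x00 that belongs to the previous newline.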

Unicode data can handle accents just fine, no further work required there.

This is most easily done with io.open(), which returns file objects that do the decoding and encoding for you:

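import io

# Read the file as text; io.open() decodes the UTF-16 bytes for you.
with io.open(esp, "r", encoding='utf16') as f:
    content = f.readlines()

# Write the modified text back out, encoded as UTF-16 again.
with io.open(esp_carsat, "w", encoding='utf16') as f:
    for line in content:
        f.write(line.replace("201", "202"))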
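
For Python 3, where the built-in open() accepts an encoding argument directly, the same round trip might look like the sketch below. The helper name replace_in_utf16_file, the temporary file names, and the sample text are hypothetical. Passing newline="" is an addition not in the answer above; it is assumed here so that the original \r\n line endings are written back unchanged.

```python
import os
import tempfile


def replace_in_utf16_file(src_path, dst_path, old, new):
    """Copy src_path to dst_path, replacing old with new in the decoded text."""
    # newline="" disables newline translation, so \r\n stays \r\n.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text.replace(old, new))


# Small demonstration on temporary files (hypothetical names and content).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")
dst_path = os.path.join(workdir, "output.txt")

# "\u00e9" is the accented letter é.
with open(src_path, "w", encoding="utf-16", newline="") as f:
    f.write("<InstanceNames>\r\ncaf\u00e9 201\r\n")

replace_in_utf16_file(src_path, dst_path, "201", "202")

with open(dst_path, "r", encoding="utf-16", newline="") as f:
    result = f.read()

with open(dst_path, "rb") as f:
    bom = f.read(2)
```

The "utf-16" codec writes a byte order mark at the start of the output, so a program that expects the \xff\xfe (or \xfe\xff) prefix should still recognise the file.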
