Extratoro

Reputation: 15

Change string in file in python

I have a problem changing a file in Python. I need to replace a string in it. The file is not a plain text file, but it can be edited with a text editor.

Here is my code:

with open(esp, "r") as f:
    content = f.readlines()
with open(esp_carsat, "w") as f:
    for line in content:
        f.write(line.replace("201", "202"))

I think the problem is that the content is read as raw bytes. The first line looks like this: '\xff\xfe<\x00I\x00n\x00s\x00t\x00a\x00n\x00c\x00e\x00N\x00a\x00m\x00e\x00s\x00>\x00\r\x00\n'

So my replace does not work. I tried changing the encoding, but the file is not readable afterwards. The file also contains accented characters (é, è, ...).

Is there a way to do what I want?

Upvotes: 0

Views: 601

Answers (2)

Martijn Pieters

Reputation: 1124988

You have UTF-16 encoded data. Decode to Unicode text, replace, and then encode back to UTF-16 again:

>>> data = '\xff\xfe<\x00I\x00n\x00s\x00t\x00a\x00n\x00c\x00e\x00N\x00a\x00m\x00e\x00s\x00>\x00\r\x00\n\x00'
>>> data.decode('utf16')
u'<InstanceNames>\r\n'

I had to append an extra \x00 to decode that. Because the file was read without decoding, Python split the line on the \n byte and left the \x00 that belongs to it at the start of the next line.
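The split described above can be reproduced in isolation. This is a minimal sketch written for Python 3 (the session above is Python 2), and the sample text containing "201" is made up for illustration; it only assumes the file is UTF-16-LE with a BOM, as the bytes in the question suggest.

```python
import codecs
import io

# Hypothetical sample content; the real file is not shown in the question.
text = "<InstanceNames>\r\n201\r\n"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Reading bytes line by line splits right after the b"\n" byte, so the
# b"\x00" that completes the "\n" code unit ends up on the next line.
byte_lines = io.BytesIO(raw).readlines()
first_line = byte_lines[0]

# The first line now has an odd number of bytes and cannot be decoded
# as UTF-16 until the missing b"\x00" is put back.
repaired = (first_line + b"\x00").decode("utf-16")

# The digits are interleaved with null bytes, so a plain byte search
# for b"201" finds nothing. This is why the original replace did nothing.
found_as_bytes = b"201" in raw
```

Here `first_line` is the same byte string shown in the question, and `found_as_bytes` is `False` even though the decoded text contains "201".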

Unicode data can handle accents just fine, no further work required there.

The easiest way to do this is with io.open(), which returns file objects that do the decoding and encoding for you:

import io

with io.open(esp, "r", encoding='utf16') as f:
    content = f.readlines()

# io.open() is needed here too: the built-in open() in Python 2 has no encoding argument
with io.open(esp_carsat, "w", encoding='utf16') as f:
    for line in content:
        f.write(line.replace("201", "202"))
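For a self-contained check of the whole round trip, here is a sketch written for Python 3, where io.open() is the same function as the built-in open(). The helper name `replace_in_utf16_file`, the temporary file names, and the sample text are all hypothetical. Passing `newline=""` is an addition not present in the answer: it turns off newline translation so the original \r\n endings are written back unchanged on any platform.

```python
import io
import os
import tempfile


def replace_in_utf16_file(src_path, dst_path, old, new):
    """Copy a UTF-16 file to dst_path, replacing old with new in its text."""
    # newline="" disables newline translation, so "\r\n" stays "\r\n".
    with io.open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text.replace(old, new))


# Build a small UTF-16 sample file with an accented character.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "in.esp")
dst_path = os.path.join(tmp_dir, "out.esp")
with io.open(src_path, "w", encoding="utf-16", newline="") as f:
    f.write("<InstanceNames>\r\n\u00e9tude 201\r\n")

replace_in_utf16_file(src_path, dst_path, "201", "202")

# Read the result back as text to confirm the replacement and the accent.
with io.open(dst_path, "r", encoding="utf-16", newline="") as f:
    result = f.read()
```

Reading the whole file with read() instead of readlines() is a simplification that suits small files; for large ones the line-by-line loop from the answer works the same way.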

Upvotes: 3

Simeon Visser

Reputation: 122526

It's UTF-16-LE data:

>>> b
'\xff\xfe<\x00I\x00n\x00s\x00t\x00a\x00n\x00c\x00e\x00N\x00a\x00m\x00e\x00s\x00>\x00\r\x00\n'
>>> print(b[:-1].decode('utf-16-le'))
<InstanceNames>
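One detail worth knowing when choosing between the two codec names: a small Python 3 sketch (the session above is Python 2) showing that 'utf-16-le' keeps the leading byte order mark as a U+FEFF character, while 'utf-16' reads the mark to pick the byte order and then drops it. The variable names are illustrative only.

```python
b = (b'\xff\xfe<\x00I\x00n\x00s\x00t\x00a\x00n\x00c\x00e\x00'
     b'N\x00a\x00m\x00e\x00s\x00>\x00\r\x00\n')

# Drop the dangling b"\n" so the byte count is even again.
even = b[:-1]

# Explicit little-endian: the BOM is decoded as an ordinary character.
with_le = even.decode("utf-16-le")

# BOM-aware codec: the BOM selects the byte order and is consumed.
with_bom_aware = even.decode("utf-16")
```

With 'utf-16-le', `with_le` starts with '\ufeff', which is invisible when printed but would be part of any string comparison or replacement.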

Upvotes: 0
