Disable the automatic change from \r\n to \n in Python

Question

I am working under Ubuntu on a Python 3.4 script that takes a file as a parameter. The file is encoded in UTF-8 and was generated under Windows. I have to go through the file line by line (lines are separated by \r\n), knowing that the "lines" contain some '\n' characters that I want to keep.

My problem is that Python transforms the file's "\r\n" to "\n" when opening. I've tried to open it with different modes ("r", "rt", "rU").

The only solution I found is to work in binary mode and not text mode, opening with the "rb" mode.
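
For reference, the binary-mode workaround might look roughly like the sketch below. The helper name `iter_crlf_records` and the sample bytes are made up for illustration; the sketch also assumes the whole file fits in memory.

```python
import io


def iter_crlf_records(binary_file, encoding="utf-8"):
    """Yield records separated by CRLF, keeping any embedded LF characters.

    Hypothetical helper: reads the whole stream at once, so it only suits
    files that fit in memory.
    """
    data = binary_file.read()
    records = data.split(b"\r\n")
    # A trailing CRLF leaves one empty chunk at the end; drop it.
    if records and records[-1] == b"":
        records.pop()
    for record in records:
        yield record.decode(encoding)


# Stand-in for open(filename, "rb"); the sample content is invented.
sample = io.BytesIO(b"first\nstill first\r\nsecond\r\n")
records = list(iter_crlf_records(sample))
```

Splitting the raw bytes on `b"\r\n"` before decoding is safe for UTF-8, because the bytes 0x0D and 0x0A never occur inside a multi-byte sequence.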

Is there a way to do this without working in binary mode, or is there a more proper way to do it?

Martijn Pieters · Accepted Answer

Set the newline keyword argument to open() to '\r\n', or perhaps to the empty string:

with open(filename, 'r', encoding='utf-8', newline='\r\n') as f:
    lines = list(f)  # split on '\r\n' only; embedded '\n' is kept as-is

This tells Python to only split lines on the \r\n line terminator; \n is left untouched in the output. If you set it to '' instead, \n is also seen as a line terminator but \r\n is not translated to \n.
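
The difference between the three settings can be checked with a small experiment. This is only a sketch: the sample content and the `read_lines` helper are invented for illustration, and the file is written in binary mode so that no newline translation happens on the way in.

```python
import os
import tempfile

# Invented sample: two CRLF-terminated records, the first with an embedded LF.
data = "first\nstill first\r\nsecond\r\n".encode("utf-8")

# Write the bytes verbatim to a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name


def read_lines(newline):
    """Return the lines seen when opening the file with the given newline value."""
    with open(path, "r", encoding="utf-8", newline=newline) as f:
        return list(f)


default_lines = read_lines(None)     # universal newlines, endings translated to '\n'
crlf_lines = read_lines("\r\n")      # only '\r\n' ends a line, nothing translated
untranslated_lines = read_lines("")  # '\n', '\r' and '\r\n' end a line, nothing translated

os.remove(path)

print(default_lines)       # ['first\n', 'still first\n', 'second\n']
print(crlf_lines)          # ['first\nstill first\r\n', 'second\r\n']
print(untranslated_lines)  # ['first\n', 'still first\r\n', 'second\r\n']
```

Only newline='\r\n' keeps the embedded '\n' inside its record, which is what the question asks for.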

From the open() function documentation:

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. [...] If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.

Bold emphasis mine.
