Reputation: 105
I am working under Ubuntu on a Python 3.4 script that takes a file as a parameter. The file is encoded in UTF-8 and was generated under Windows. I have to go through the file line by line (lines are separated by \r\n), knowing that the "lines" contain some '\n' characters that I want to keep.
My problem is that Python transforms the file's "\r\n" to "\n" when opening. I've tried to open with different modes ("r", "rt", "rU").
The only solution I found is to work in binary mode instead of text mode, opening with the "rb" mode.
Is there a way to do this without working in binary mode, or a more proper way to do it?
Upvotes: 5
Views: 1618
Reputation: 31660
The solution, from Martijn Pieters, is:
with open(filename, "r", newline='\r\n') as f:
This answer was posted as an edit to the question disable the automatic change from \r\n to \n in python by the OP lu1her under CC BY-SA 3.0.
Upvotes: 0
Reputation: 1122172
Set the newline keyword argument to open() to '\r\n', or perhaps to the empty string:
with open(filename, 'r', encoding='utf-8', newline='\r\n') as f:
This tells Python to only split lines on the \r\n line terminator; \n is left untouched in the output. If you set it to '' instead, \n is also seen as a line terminator but \r\n is not translated to \n.
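To make the difference concrete, here is a small sketch that reads the same bytes with three newline settings. The sample data and the read_lines helper are made up for illustration; they are not from the question.

```python
import os
import tempfile

# Made-up sample: two Windows-style records, each with an embedded '\n'.
raw = b"a\nb\r\nc\nd\r\n"

# Write in binary mode so the bytes land on disk exactly as given.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name


def read_lines(path, newline):
    # Hypothetical helper: read all lines in text mode with the given newline setting.
    with open(path, "r", encoding="utf-8", newline=newline) as f:
        return f.readlines()


default_lines = read_lines(path, None)  # universal newlines, '\r\n' translated to '\n'
empty_lines = read_lines(path, "")      # universal newlines, nothing translated
crlf_lines = read_lines(path, "\r\n")   # only '\r\n' terminates a line

os.remove(path)

print(default_lines)  # ['a\n', 'b\n', 'c\n', 'd\n']
print(empty_lines)    # ['a\n', 'b\r\n', 'c\n', 'd\r\n']
print(crlf_lines)     # ['a\nb\r\n', 'c\nd\r\n']
```

Only the last setting keeps each record in one piece, which is what the question asks for.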
From the open() function documentation:
newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. [...] If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
Bold emphasis mine.
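For the use case in the question, one possible way to apply this is a small generator that yields each record with its trailing '\r\n' removed and any embedded '\n' kept. This is only a sketch: iter_records and the demo data are hypothetical names and values, and the slicing is used instead of newer string methods so that it would also work on Python 3.4.

```python
import os
import tempfile


def iter_records(path):
    # Hypothetical helper: yield each CRLF-terminated record without its terminator.
    with open(path, "r", encoding="utf-8", newline="\r\n") as f:
        for line in f:
            # Strip only the trailing '\r\n'; embedded '\n' characters stay.
            if line.endswith("\r\n"):
                line = line[:-2]
            yield line


# Made-up demo file, written in binary mode to control the exact bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write("first\nrecord\r\nsecond record\r\n".encode("utf-8"))
    demo_path = tmp.name

records = list(iter_records(demo_path))
os.remove(demo_path)

print(records)  # ['first\nrecord', 'second record']
```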
Upvotes: 7