Reputation: 2711
In recent Python 3.x, the default behavior of open is to open files in universal newlines mode (newline=None), which means that if I open a file like this:
f = open("file.txt")
then all the different line endings (CRLF/CR/LF) get converted into "\n" when using f.readline(), for line in f:, etc.
However, when dealing with sys.stdin, the different line endings are not converted (at least on Linux, reading a CRLF file means that the result of sys.stdin.readline() ends with "\r\n"). This means that sys.stdin is opened using a different newline setting. Is there any way of influencing the parameters that are used for opening sys.stdin?
More generally, is there something in Python akin to Perl's binmode that would allow changing the way reading from sys.stdin / writing to sys.stdout works?
Just to be clear, I know that I can do the CRLF→LF conversion myself. That is not what this question is about.
Upvotes: 9
Views: 5914
Reputation: 328
You can reconfigure() sys.stdin with e.g.:
sys.stdin.reconfigure(newline=None)
which should then match the behaviour of any other file opened in text mode with open().
Beware that the stream must be reconfigured before it is read from for the first time. Note also that reconfigure() requires Python 3.7 or later.
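A minimal sketch of the effect, using an in-memory stream in place of the real sys.stdin so that it runs anywhere. The helper name read_lines_universal and the sample bytes are made up for illustration; the wrapper is created with newline="\n" to mimic the untranslated behaviour described in the question:

```python
import io


def read_lines_universal(raw_bytes):
    """Hypothetical helper: wrap bytes like an untranslated stdin, then reconfigure."""
    # Stand-in for sys.stdin: a text wrapper that only treats "\n" as a line ending.
    stream = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8", newline="\n")
    # Switch to universal newlines; this happens before any data is read.
    stream.reconfigure(newline=None)
    return stream.readlines()


if __name__ == "__main__":
    # CRLF, CR, and LF endings mixed in one input.
    print(read_lines_universal(b"one\r\ntwo\rthree\n"))
```

With the real stream, the equivalent call would simply be sys.stdin.reconfigure(newline=None) at the top of the program.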
Upvotes: 2
Reputation: 2711
This can be solved as follows:
sys.stdin = io.TextIOWrapper(sys.stdin.buffer)
This causes sys.stdin to behave as if it were opened using open with default parameters (including newline=None). TextIOWrapper's constructor takes the same encoding, errors, and newline parameters that open does, so we can also do things like:
sys.stdin = io.TextIOWrapper(sys.stdin.buffer, newline="", encoding="utf-8")
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, newline="\r\n")
etc.
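To see what the different newline values do, here is a small sketch that wraps in-memory byte streams instead of the real sys.stdin.buffer / sys.stdout.buffer. The names SAMPLE, read_with, and write_with are illustrative only:

```python
import io

# Sample input mixing CRLF, CR, and LF line endings.
SAMPLE = b"one\r\ntwo\rthree\n"


def read_with(newline):
    """Read SAMPLE through a TextIOWrapper created with the given newline setting."""
    wrapper = io.TextIOWrapper(io.BytesIO(SAMPLE), encoding="utf-8", newline=newline)
    return wrapper.readlines()


def write_with(newline, text):
    """Write text through a TextIOWrapper and return the bytes that reach the buffer."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()  # push the encoded text down into the underlying buffer
    return raw.getvalue()


if __name__ == "__main__":
    print(read_with(None))   # universal newlines, translated to "\n"
    print(read_with(""))     # universal newlines, endings left untouched
    print(read_with("\n"))   # only "\n" ends a line, nothing translated
    print(write_with("\r\n", "a\nb\n"))  # "\n" written out as CRLF
```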
We thus have a way of influencing the way stdin/stdout is opened, similarly to Perl's binmode (the binmode FILEHANDLE, LAYERS version).
Thanks to @abarnert for the comment about TextIOWrapper.
Upvotes: 1
Reputation: 365767
There's no way to fully replace how Python wraps the stdin and stdout pipes in sys.stdin and sys.stdout. (There is some limited control, like the -u option, but that's it.)
However, you're asking for multiple other things, all of which are possible, and some of which might be what you want.
First, sys.stdin is just a normal TextIOWrapper, as returned by open (in text mode). This means you can access its underlying binary object, or the raw unbuffered file object underneath that, or the OS file descriptor underneath that, the same as for any other text file:
sys.stdin.buffer
sys.stdin.buffer.raw
sys.stdin.fileno()
Often, that's all you need. You wouldn't want to actually do sys.stdin = sys.stdin.buffer. That would break input, the fileinput module, and who knows what else that's expecting stdin to be a text file. But you can just use sys.stdin.buffer instead of sys.stdin, and it's roughly the equivalent of what I think you were looking for from Perl's binmode.
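As a rough illustration of reading through the binary layer, the sketch below uses an in-memory text stream as a stand-in for sys.stdin; raw_lines is a hypothetical helper name, not part of any library:

```python
import io


def raw_lines(text_stream):
    """Return undecoded lines from the binary layer beneath a text stream."""
    # .buffer is the binary object the text wrapper sits on top of.
    return text_stream.buffer.readlines()


if __name__ == "__main__":
    # Stand-in for sys.stdin; with the real stream you would call raw_lines(sys.stdin).
    fake_stdin = io.TextIOWrapper(io.BytesIO(b"one\r\ntwo\n"), encoding="utf-8")
    print(raw_lines(fake_stdin))
```

It is probably best not to mix reads from the text layer and from .buffer on the same stream, since the text layer reads ahead and may already have consumed bytes you expected to find in the buffer.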
And if you want a TextIOWrapper with different options wrapped around the same underlying file, you can do that too.
For the simplest cases, the easiest way to do that is just to call open the way you wanted to call open, passing the file descriptor:
sin = open(sys.stdin.fileno(), <your open arguments here>)
For less trivial cases, you may need to read the TextIOWrapper docs (and BufferedReader and other types in the io module) and wrap exactly the part you want in exactly the way you want. But usually, open will be good enough.
Either way, you can just use sin in place of sys.stdin.
Or, since this now is a perfectly good TextIOWrapper (assuming you opened in text mode), it is safe and reasonable to replace sys.stdin:
sys.stdin = sin
… and now input, etc., will continue to work, and will work the way you wanted them to.
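Here is a sketch of the open-on-a-file-descriptor approach, using an os.pipe() as a stand-in for the real stdin descriptor so that it can be run on its own. The names reopen_text and pipe_demo are illustrative, and passing closefd=False is an assumption added here so that closing the new wrapper does not also close the underlying descriptor:

```python
import os


def reopen_text(fd, **open_kwargs):
    """Open a new text wrapper around an existing file descriptor."""
    # closefd=False: the descriptor stays open when the wrapper is closed.
    return open(fd, closefd=False, **open_kwargs)


def pipe_demo():
    """Feed mixed line endings through a pipe and read them with universal newlines."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"one\r\ntwo\rthree\n")
    os.close(write_fd)  # signal end of input to the reader
    reader = reopen_text(read_fd, encoding="utf-8", newline=None)
    lines = reader.readlines()
    reader.close()
    os.close(read_fd)  # still ours to close because of closefd=False
    return lines


if __name__ == "__main__":
    print(pipe_demo())
    # With the real stream, something like:
    #   sys.stdin = reopen_text(sys.stdin.fileno(), newline=None)
```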
Upvotes: 7