Reputation: 89
I want to read from standard input chunk by chunk until EOF. For example, I could have a very large file, and I want to read and process 1024 bytes at a time from STDIN until EOF is encountered. I've seen sys.stdin.read(), which reads everything into memory at once. That isn't feasible here because there might not be enough memory available to hold the entire file. There is also "for line in sys.stdin", but that splits the input on newlines only, which is not what I'm looking for. Is there any way to accomplish this in Python?
Upvotes: 4
Views: 4050
Reputation: 2000
Inspired by @Andre's answer, but updated to Python 3 code that also handles SIGINT (just because...):
#!/usr/bin/env python3
########
# g.py #
########
import signal
import sys

def process_data(buffer):
    # Write each chunk straight back out; flush so output appears immediately.
    sys.stdout.buffer.write(buffer)
    sys.stdout.buffer.flush()

def read_stdin_stream(handler, chunk_size=1024):
    with sys.stdin as f:
        while True:
            # Read from the underlying binary buffer; b'' signals EOF.
            buffer = f.buffer.read(chunk_size)
            if buffer == b'':
                break
            handler(buffer)

def signal_handler(sig, frame):
    # On Ctrl-C, flush whatever has been written so far and exit cleanly.
    sys.stdout.buffer.flush()
    sys.exit(0)

def main():
    signal.signal(signal.SIGINT, signal_handler)
    # notice the `chunk_size` of 1 for this particular example
    read_stdin_stream(process_data, chunk_size=1)

if __name__ == "__main__":
    main()
Example:
$ for i in $(seq 1 5); do echo -n "$i" && sleep 1; done | python3 g.py
12345
Upvotes: 1
Reputation: 1310
You can read stdin (or any file) in chunks using f.read(n), where n is the number of bytes you want to read. It returns an empty string when there is nothing left in the file.
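For instance, a minimal sketch of this applied to stdin (process_chunk is a hypothetical stand-in for whatever you do with each chunk; reading from sys.stdin.buffer keeps the read binary-safe):
import sys

def process_chunk(chunk):
    # Hypothetical handler; replace with your own processing.
    sys.stdout.buffer.write(chunk)

while True:
    chunk = sys.stdin.buffer.read(1024)  # at most 1024 bytes per read
    if not chunk:  # b'' means EOF on a binary stream
        break
    process_chunk(chunk)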
Upvotes: 1
Reputation: 15537
The read() method of a file object accepts an optional size parameter. If you specify size, at most size bytes are read and returned. If the end of the file has been reached, f.read() will return an empty string ('').
See the io docs and open() docs.
Pseudo code:
with open('file') as f:
    while True:
        buffer = f.read(1024)  # Returns *at most* 1024 bytes, maybe fewer
        if buffer == '':
            break
        process_data(buffer)
Upvotes: 3