David Andrews

Reputation: 89

Python: How to read from stdin by byte chunks until EOF?

I want to read from standard input chunk by chunk until EOF. For example, I could have a very large file, and I want to read and process 1024 bytes at a time from stdin until EOF is encountered. I've seen sys.stdin.read(), which loads everything into memory at once. That isn't feasible because there might not be enough memory to hold the entire file. There is also "for line in sys.stdin", but that splits the input on newlines only, which is not what I'm looking for. Is there any way to accomplish this in Python?

Upvotes: 4

Views: 4050

Answers (3)

gmagno

Reputation: 2000

Inspired by @André's answer, but written for Python 3 and with SIGINT handling added (just because...):

#!/usr/bin/env python3

########
# g.py #
########

import signal
import sys


def process_data(buffer):
    sys.stdout.buffer.write(buffer)
    sys.stdout.buffer.flush()


def read_stdin_stream(handler, chunk_size=1024):
    with sys.stdin as f:
        while True:
            buffer = f.buffer.read(chunk_size)
            if buffer == b'':
                break
            handler(buffer)


def signal_handler(sig, frame):
    sys.stdout.buffer.flush()
    sys.exit(0)


def main():
    signal.signal(signal.SIGINT, signal_handler)

    # notice the `chunk_size` of 1 for this particular example
    read_stdin_stream(process_data, chunk_size=1)


if __name__ == "__main__":
    main()

Example:

$ for i in $(seq 1 5); do echo -n "$i" && sleep 1; done | python3 g.py
12345
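
The chunk_size of 1 matters in the example above: sys.stdin.buffer is normally an io.BufferedReader, and its read(n) keeps reading until it has n bytes or reaches EOF, so with a larger chunk size the digits from a slow pipe would only appear once a chunk fills up. As a hedged alternative, read1(n) returns whatever is available after at most one underlying read. The sketch below assumes Python 3; read_available and max_size are illustrative names, not part of the original script.

```python
import io
import sys


def read_available(stream=None, max_size=1024):
    """Yield chunks as they arrive, each at most max_size bytes."""
    if stream is None:
        stream = sys.stdin.buffer  # binary view of standard input
    while True:
        # read1() performs at most one raw read, so it does not wait
        # for max_size bytes to accumulate before returning.
        chunk = stream.read1(max_size)
        if not chunk:  # b'' means EOF
            break
        yield chunk


# Demonstration on an in-memory stream instead of a real pipe.
for chunk in read_available(io.BytesIO(b"12345"), max_size=2):
    sys.stdout.write(chunk.decode("ascii"))
sys.stdout.write("\n")
```

With a real pipe the chunk boundaries depend on how the writer flushes, so the handler should not assume any particular chunk length.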

Upvotes: 1

Izaak Weiss

Reputation: 1310

You can read stdin (or any file) in chunks using f.read(n), where n is the maximum number of bytes (characters, for a text-mode stream) to read. It returns an empty string, or b'' for a binary stream such as sys.stdin.buffer, once nothing is left in the file.
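
A minimal sketch of that loop, assuming Python 3 and byte-oriented input. The names read_chunks, stream, and chunk_size are illustrative, not part of any library. It reads from sys.stdin.buffer by default, because sys.stdin.read(n) on Python 3 counts characters rather than bytes.

```python
import io
import sys


def read_chunks(stream=None, chunk_size=1024):
    """Yield successive chunks of at most chunk_size bytes until EOF."""
    if stream is None:
        stream = sys.stdin.buffer  # binary view of standard input
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # b'' signals that nothing is left
            break
        yield chunk


# Demonstration on an in-memory stream so the sketch runs without piped input.
print(list(read_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4)))
# [b'abcd', b'efgh', b'ij']
```

Calling read_chunks() with no arguments would consume standard input instead, holding only one chunk in memory at a time.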

Upvotes: 1

André Laszlo

Reputation: 15537

The read() method of a file object accepts an optional size parameter.

If you specify size, at most size bytes (characters, in text mode) are read and returned. If the end of the file has been reached, f.read() will return an empty string ('' in text mode, b'' in binary mode).

See the io docs and open() docs.

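Example, with process_data as a placeholder:

def process_data(buffer):
    pass  # placeholder: replace with the real processing

with open('file', 'rb') as f:  # binary mode, so read() returns bytes
    while True:
        buffer = f.read(1024)  # Returns *at most* 1024 bytes, maybe less
        if not buffer:  # empty result ('' or b'') means EOF
            break
        process_data(buffer)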
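
The same loop can also be written with the two-argument form of the built-in iter(), which calls a function repeatedly until it returns a sentinel value. This is only a sketch, assuming Python 3 and a binary stream; iter_chunks is a hypothetical helper name.

```python
import io
from functools import partial


def iter_chunks(stream, chunk_size=1024):
    """Return an iterator over chunks of a binary stream, stopping at EOF."""
    # iter(callable, sentinel) keeps calling stream.read(chunk_size)
    # until it returns the sentinel b'', which marks end of file.
    return iter(partial(stream.read, chunk_size), b"")


# Demonstration on an in-memory stream; pass sys.stdin.buffer for real input.
sizes = [len(chunk) for chunk in iter_chunks(io.BytesIO(b"x" * 2500))]
print(sizes)  # [1024, 1024, 452]
```

The sentinel must match the stream type: b"" for binary streams, "" for text-mode ones.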

Upvotes: 3
