Anonymous

Reputation: 305

What is the difference between these two ways to read lines in Python with `sys.stdin`?

I don't think I understand the different ways to read lines from the input using sys.stdin.

What is the difference between

import sys
while True:
    foo(sys.stdin.readline())

and

import sys
for line in sys.stdin:
    foo(line)

and why would I pick one choice over the other?

Also, how would I get the behavior of

import sys
first_line = sys.stdin.readline()
foo(first_line)
while True:
    bar(sys.stdin.readline())

by using a for-in loop? Specifically, what would be an elegant way to treat the first line separately from the other lines in the input? Does something along the lines of for line in sys.stdin still work?

Upvotes: 0

Views: 925

Answers (2)

user149341

Reputation:

while True:
    foo(sys.stdin.readline())

This code will loop forever. If there is an EOF on sys.stdin -- for instance, if input was redirected from a file, and the end of that file has been reached -- then it will call foo('') repeatedly. This is probably bad.

for line in sys.stdin:
    foo(line)

This code will stop looping when an EOF is encountered. This is good.
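
To see the difference at EOF without waiting on real input, here is a minimal sketch that uses io.StringIO as a stand-in for sys.stdin redirected from a file (the sample text is made up for illustration):

```python
import io

# Stand-in for sys.stdin redirected from a two-line file.
stream = io.StringIO("one\ntwo\n")

# Iteration stops on its own once the end of the input is reached.
lines_from_for = [line for line in stream]

# After EOF, readline() does not raise; every further call returns ''.
after_eof = [stream.readline() for _ in range(3)]

print(lines_from_for)
print(after_eof)
```

An unguarded while True loop would therefore keep passing '' to foo indefinitely.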

If you want to handle the first line differently, you can simply call sys.stdin.readline() once before entering the loop:

first_line = sys.stdin.readline()
foo(first_line)
for line in sys.stdin:
    bar(line)

Upvotes: 2

abarnert

Reputation: 365657

There's nothing special about sys.stdin here; it's just a normal text file object.

Iterating any iterable, including a file object, with for x in iterable:, just calls next on it over and over until it raises a StopIteration.

Notice that this means that if you want to skip over a header line before processing the rest of a file, you can just call next(f) before the loop.
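
As a rough sketch of that idea applied to the question's foo/bar case: process_lines, handle_first and handle_rest are hypothetical names, and io.StringIO stands in for sys.stdin. Passing a default to next avoids a StopIteration when the input is empty.

```python
import io

def process_lines(stream, handle_first, handle_rest):
    """Treat the first line specially, then loop over the remaining lines."""
    first = next(stream, None)   # None instead of StopIteration on empty input
    if first is None:
        return
    handle_first(first)
    for line in stream:          # continues from the second line
        handle_rest(line)

headers, rows = [], []
process_lines(io.StringIO("name,age\nann,31\nbob,27\n"), headers.append, rows.append)
print(headers)
print(rows)
```

With real input, the call would be process_lines(sys.stdin, foo, bar).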

And readline does the same thing as next, except for the size parameter (which you're not using), what happens on various error conditions (which aren't likely to matter here), and what happens at EOF: readline returns an empty string, while next raises a StopIteration.
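
A small illustration of those differences, again with io.StringIO standing in for a real file object. The optional argument to readline is called size in Python 3's io module, and the sample text is invented:

```python
import io

stream = io.StringIO("hello world\n")

partial = stream.readline(5)   # reads at most 5 characters of the current line
rest = stream.readline()       # reads the remainder of that same line
at_eof = stream.readline()     # nothing left: readline returns ''

try:
    next(stream)               # nothing left: next raises StopIteration
    raised = False
except StopIteration:
    raised = True

print(repr(partial), repr(rest), repr(at_eof), raised)
```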

So, there's no general reason to pick one over the other; it comes down to which is more readable in your particular case.


If your goal is to loop over all the lines, it's a lot more readable to use a for loop. Compare:

for line in sys.stdin:
    do_stuff(line)

while True:
    line = sys.stdin.readline()
    if not line:
        break
    do_stuff(line)

If, on the other hand, your loop involves reading variable chunks of stuff with some non-trivial logic, readline is usually going to be clearer:

while True:
    line = sys.stdin.readline()
    if not line:
        break
    while line.rstrip().endswith('\\'):
        line = line.rstrip().rstrip('\\') + sys.stdin.readline()
    do_stuff(line)

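logical_line = ''
for line in sys.stdin:
    if line.rstrip().endswith('\\'):
        # Continuation: drop the backslash and keep accumulating.
        logical_line += line.rstrip().rstrip('\\')
    else:
        do_stuff(logical_line + line)
        logical_line = ''
# The for loop ends quietly at EOF, so flush any unfinished line here.
if logical_line:
    do_stuff(logical_line)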
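
For anyone who wants to try the readline approach on sample data, here is a sketch that wraps it in a generator. logical_lines is a hypothetical helper name, and io.StringIO stands in for sys.stdin:

```python
import io

def logical_lines(stream):
    """Yield lines, joining any line that ends in a backslash with the next one."""
    while True:
        line = stream.readline()
        if not line:             # '' means EOF
            return
        while line.rstrip().endswith('\\'):
            # Drop the trailing backslash and newline, then append the next line.
            # At EOF readline() returns '', which ends the inner loop.
            line = line.rstrip().rstrip('\\') + stream.readline()
        yield line

sample = io.StringIO("a \\\nb\nc\n")
joined = list(logical_lines(sample))
print(joined)
```

Writing it as a generator lets the caller go back to a plain for loop: for line in logical_lines(sys.stdin): do_stuff(line).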

Upvotes: 2
