SSS

Reputation: 2452

Does "for line in file" read the entire file?

Does the following code read one line per iteration, or does it read the entire file into memory before the iteration begins?

for line in f:
    print(line)

My intention is to read a single line from the file.

Upvotes: 4

Views: 2022

Answers (4)

Mark Ransom

Reputation: 308412

If all you need to do is read a single line, and it's followed by binary data, you will need to open the file in binary mode anyway. It's easy then to emulate what Python does when it reads a line: read into a temporary buffer and search for the linefeed character. I'm assuming the text is in an 8-bit ASCII-compatible encoding. You'll need to choose some reasonable maximum line length for max_line_size or the algorithm gets a lot more complicated.

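max_line_size = 4096  # assumed upper bound; pick a value that fits your data

with open(filename, 'rb') as f:
    buffer = f.read(max_line_size)
    # Index of the first linefeed, or -1 if none was found in the buffer.
    newline_pos = buffer.find(b'\n')
    if newline_pos < 0:
        raise RuntimeError('Line in file too long')
    line = buffer[:newline_pos].decode()
    # Reposition just past the linefeed so the next read starts at the binary data.
    f.seek(newline_pos + 1)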
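
As a rough sketch, the same idea could be wrapped in a helper and tried on an in-memory file. The name read_first_line, the 1024-byte default and the sample bytes are made up for illustration; they are not part of any library.

```python
import io


def read_first_line(f, max_line_size=1024):
    """Return the first line of binary file f as text.

    Leaves f positioned just past the linefeed. Hypothetical helper:
    assumes an ASCII-compatible encoding and a line no longer than
    max_line_size bytes.
    """
    start = f.tell()
    buffer = f.read(max_line_size)
    newline_pos = buffer.find(b'\n')
    if newline_pos < 0:
        raise RuntimeError('Line in file too long')
    # Seek relative to where we started, so this also works mid-file.
    f.seek(start + newline_pos + 1)
    return buffer[:newline_pos].decode('ascii')


# Made-up sample: a text header line followed by a binary payload.
f = io.BytesIO(b'HEADER v1\n\x00\x01\x02\x03')
header = read_first_line(f)
payload = f.read()
```

Note that a file with Windows line endings would leave a trailing '\r' on the returned line, which you may want to strip.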

Upvotes: 0

Serge Ballesta

Reputation: 149085

You cannot be sure. All you can know is that it will return one line at a time. The Python Standard Library documentation says: "In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right."

My understanding is that the read-ahead buffer loads a full chunk (of unspecified size) and looks for the end of line in that buffer. For a small file (a few KB), you can be sure that there will be only one physical read. I once tried a read after getting the first line with next() on a small file (about 50 lines) and found the file pointer at the end of the file.

Of course, a really big file will be read physically one chunk at a time, and Python will hold only a single line at a time in memory. So it is far more memory-conservative than readlines(). But after all, on common systems (Unix-like, Mac OS or Windows) the underlying read system call on a file(*) has no notion of end of line and can only read up to a maximum number of bytes. So on those systems there is no way to physically read up to an end of line, whatever language you use. You can only have utilities that fill an internal buffer and then look for the end of line in that buffer. That is what the next() method does for a file object in Python.
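
If you want to see the buffering for yourself, here is a small sketch for Python 3. It compares the logical position of the buffered file object with the position of the underlying raw file after one line has been read. The helper name positions_after_first_line, the 4096-byte buffer size and the 50-line sample file are assumptions chosen for the demonstration.

```python
import os
import tempfile


def positions_after_first_line(path, buffer_size=4096):
    """Read one line, then return (logical pos, OS-level pos, file size)."""
    with open(path, 'rb', buffering=buffer_size) as f:
        f.readline()
        logical = f.tell()       # where the next readline() would start
        physical = f.raw.tell()  # how far the underlying file has really been read
    return logical, physical, os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'small.txt')
    # Made-up sample: 50 short lines, well under the buffer size.
    with open(path, 'wb') as out:
        out.write(b''.join(b'line %d\n' % i for i in range(50)))
    logical, physical, size = positions_after_first_line(path)

print(logical, physical, size)
```

On a file this small I would expect the physical position to be well past the first line, typically at the end of the file, while the logical position sits right after the first linefeed.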

After your comments, I understand that you are trying to get only the first line. You can do it with:

line = next(f)  # spelled f.next() on Python 2 file objects

But do not try to use any read method after that because, as I explained above, the file pointer may be far beyond the end of the first line.

(*) It would not be the same when reading from a console or a terminal device.

Upvotes: 5

ha9u63a7

Reputation: 6854

You can do either that or this:

f = open('a file')

s = f.readlines()  # Read all lines, no looping

This is mentioned in the Python docs. There is also list(f), which gives you the lines as items in a list.
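
A quick sketch of the two spellings side by side, using io.StringIO as a stand-in for a real file (the sample text is made up). Both build the whole list in memory, so neither helps if you only want one line.

```python
import io

text = 'first\nsecond\nthird\n'

# Each call gets a fresh in-memory "file", since reading consumes it.
via_readlines = io.StringIO(text).readlines()
via_list = list(io.StringIO(text))

print(via_readlines == via_list)
```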

Upvotes: -3

TigerhawkT3

Reputation: 49330

It works with one line at a time instead of reading the whole thing into memory at once. That's why it's recommended so often.
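
For instance, a minimal sketch of taking only the first few lines without building a list of the whole file. The helper name first_lines and the sample text are made up, and io.StringIO stands in for a real file object.

```python
import io
from itertools import islice


def first_lines(f, n):
    """Return at most n lines from f, pulling them one at a time."""
    return list(islice(f, n))


f = io.StringIO('a\nb\nc\nd\n')
head = first_lines(f, 2)
print(head)
```

The iteration stops as soon as n lines have been produced, so the rest of the file is never turned into Python strings.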

Upvotes: -1
