Ana Nimbus

Reputation: 735

What is a Pythonic way to detect that the next read will produce an EOF in Python 3 (and Python 2)?

Currently, I am using

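def eofapproached(f):
    pos  = f.tell()          # remember where we are
    near = not f.read(1)     # '' (text mode) or b'' (binary mode) means EOF
    f.seek(pos)              # undo the one-character read
    return near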

to detect if a file open in 'r' mode (the default) is "at EOF" in the sense that the next read would produce the EOF condition.

I might use it like so:

f = open('filename.ext') # default 'r' mode

print(eofapproached(f))

FYI, I am working with some existing code that stops when EOF occurs, and I want my code to do some action just before that happens.
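
A minimal sketch of that use case, assuming the surrounding code reads with readline(). The names read_lines_with_hook and before_eof are hypothetical, and the temporary file only stands in for real input. The check uses `not f.read(1)` so that it also holds for files opened in binary mode.

```python
import os
import tempfile

def eofapproached(f):
    """Return True if the next read from f would hit EOF."""
    pos = f.tell()
    near = not f.read(1)   # '' or b'' at EOF
    f.seek(pos)            # restore the position we started from
    return near

def read_lines_with_hook(path, before_eof):
    """Read lines with readline(); call before_eof() once nothing is left."""
    lines = []
    with open(path) as f:
        while True:
            if eofapproached(f):
                before_eof()          # the "just before EOF" action
                break
            lines.append(f.readline())
    return lines

# Demo on a throwaway file (hypothetical content).
events = []
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\nsecond\n')
    demo_path = tmp.name

demo_lines = read_lines_with_hook(demo_path, lambda: events.append('about to hit EOF'))
os.remove(demo_path)
```

The hook fires exactly once, after the last line has been consumed and before any read returns an empty string.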

I am also interested in any suggestions for a better (e.g., more concise) function name. I thought of eofnear, but that does not necessarily convey as specific a meaning.

Currently, I use Python 3, but I may be forced to use Python 2 (part of a legacy system) in the future.

Upvotes: 3

Views: 255

Answers (2)

GIZ

Reputation: 4633

I've formulated this code to avoid the use of tell (perhaps using tell is simpler):

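from __future__ import print_function   # lets print(..., end='') work on Python 2 as well
import os

class NearEOFException(Exception):
    """Raised by the generator when the next chunk would reach EOF."""

def tellMe_before_EOF(filePath, chunk_size):
    fileSize = os.path.getsize(filePath)    # size in bytes, not characters
    chunks_num = fileSize // chunk_size     # how many full chunks fit in the file?

    with open(filePath) as f:               # the file is closed once the loop is done
        # yield every full chunk except the last one
        for i in range(chunks_num - 1):
            yield f.read(chunk_size)
    # also reached immediately when the file holds less than one full chunk
    raise NearEOFException("File is near EOF")


if __name__ == "__main__":
    g = tellMe_before_EOF("xyz", 3)   # read in chunks of 3 chars
    try:
        while True:
            print(next(g), end='')    # raises NearEOFException near EOF
    except NearEOFException:
        pass                          # do the "just before EOF" action here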

The function name is debatable. Naming things is tedious, and I'm just not good at it.

The function works like this: it takes the size of the file, works out approximately how many chunks of size N can be read from it, and stores that number in chunks_num. This simple division gets us near EOF. The question is where you consider "near EOF" to be: near the last character, for example, or near the last n characters? Keep that in mind if it matters.

Trace through this code to see how it works.
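
As a variation that avoids both tell and the file size, one could read one chunk ahead and flag the final chunk. This is a different technique from the code above (a lookahead generator), shown only as a sketch. The name chunks_with_last_flag is hypothetical, and io.StringIO stands in for an open file.

```python
import io

def chunks_with_last_flag(f, chunk_size):
    """Yield (chunk, is_last) pairs; is_last is True for the final non-empty chunk."""
    current = f.read(chunk_size)
    while current:
        upcoming = f.read(chunk_size)   # read one chunk ahead
        yield current, not upcoming     # an empty lookahead means current is the last chunk
        current = upcoming

# Demo with an in-memory text stream instead of a real file.
demo_chunks = list(chunks_with_last_flag(io.StringIO(u"abcdefgh"), 3))
```

Here "near EOF" means "holding the last chunk", so the caller can act on is_last before asking for more data. An empty file yields nothing at all.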

Upvotes: 0

Baldrickk

Reputation: 4409

You can use f.tell() to find out your current position in the file.

The problem is that you need to find out how big the file is. The naive (and efficient) solution is to call os.path.getsize(filepath) and compare that to the result of tell(), but getsize returns the size in bytes, which is only relevant when reading in binary mode ('rb'), as your file may contain multi-byte characters.
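
To illustrate the bytes-versus-characters mismatch, here is a small sketch that assumes a UTF-8 encoded file. The sample text and the temporary file are made up for the demonstration.

```python
import io
import os
import tempfile

text = u"caf\u00e9\n"   # 5 characters; the e-acute takes 2 bytes in UTF-8

# Write the sample text to a throwaway file without newline translation.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(demo_path, "w", encoding="utf-8", newline="") as f:
    f.write(text)

size_in_bytes = os.path.getsize(demo_path)        # what getsize reports
with io.open(demo_path, encoding="utf-8", newline="") as f:
    chars_read = len(f.read())                    # what a text-mode read returns

os.remove(demo_path)
```

The file occupies 6 bytes on disk, yet a text-mode read returns only 5 characters, so counting characters read against getsize would never reach the end.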

Your best solution is to seek to the end and back to find out the size.

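def char_count(f):
    current = f.tell()       # remember the current position
    f.seek(0, 2)             # whence=2: jump to the end of the file
    end = f.tell()           # position of EOF, in the same units tell() uses
    f.seek(current)          # go back to where we were
    return end

def chars_left(f, length=None):
    if length is None:       # "not length" would also recompute for an empty file
        length = char_count(f)
    return length - f.tell()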

Preferably, run char_count once at the beginning and then pass the result into chars_left. Seeking isn't efficient, but you need to know how long the file is in the same units that tell() reports, and the only way to get that is from the file object itself.

If you are reading line by line and want to know before reading the last line, you also have to know how long the last line is, so that you can tell whether you are at the beginning of it.
If you are reading line by line and only want to know whether the next read will result in an EOF, then you are there when chars_left(f, total) == 0 (no more lines are left to read).
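
A sketch of the line-by-line case, under a few assumptions: the file is UTF-8 text, lines are read with readline() (in Python 3, iterating with `for line in f` disables tell() on text files), and positions are only compared for equality, since text-mode tell() returns an opaque number. The names end_position and lines_with_eof_flag are hypothetical.

```python
import io
import os
import tempfile

def end_position(f):
    """Return the tell() value at end of file, restoring the current position."""
    current = f.tell()
    f.seek(0, os.SEEK_END)
    end = f.tell()
    f.seek(current)
    return end

def lines_with_eof_flag(path):
    """Return (line, at_eof) pairs; at_eof means nothing is left after that line."""
    result = []
    with io.open(path, encoding="utf-8") as f:
        total = end_position(f)              # computed once, up front
        while f.tell() != total:
            line = f.readline()              # readline() keeps tell() usable
            result.append((line, f.tell() == total))
    return result

# Demo on a throwaway file (hypothetical content).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(demo_path, "w", encoding="utf-8", newline="") as f:
    f.write(u"one\ntwo\nthree\n")

demo_pairs = lines_with_eof_flag(demo_path)
os.remove(demo_path)
```

The flag turns True on the line after which the next read would return an empty string, which is the point where a "just before EOF" action could run.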

Upvotes: 1
