Max Li

Reputation: 5199

strange behaviour of filehandle.tell() function

I don't get why the tell() function doesn't work in this case. Let's create a file with the string "1\n2\n3\n4\n" inside:

f=open('test.tmp','w')
f.write('1\n2\n3\n4\n')
f.close()

Now, let's open it and run the following code:

fTellResults=[]
f=open('test.tmp','r+')
for line in f:
    fTellResults.append(f.tell())
f.close()
print fTellResults

As a result I get:

[8L, 8L, 8L, 8L]

However, I would expect rather this:

[2L, 4L, 6L, 8L]

Could anyone explain why it works like this, and how I can get the expected result?

P.S. I am using Python 2.7.1 on Linux.

Upvotes: 2

Views: 194

Answers (2)

Felix Loether

Reputation: 6188

From the Python 2 documentation for file.next():

A file object is its own iterator, for example iter(f) returns f (unless f is closed). When a file is used as an iterator, typically in a for loop (for example, for line in f: print line), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit when the file is open for reading (behavior is undefined when the file is open for writing). In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right. However, using seek() to reposition the file to an absolute position will flush the read-ahead buffer.

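Based on this, I believe the position reported by file.tell() does not correspond to the lines consumed so far, because the file has already been read into the read-ahead buffer and the underlying file position has moved past them.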
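If the for loop is to be kept, one possible workaround is to avoid tell() entirely and accumulate the line lengths by hand. The following is only a sketch under some assumptions: the file is opened in binary mode so that len(line) counts bytes, and the function name offsets_by_length and the temporary file are made up for illustration.

```python
import os
import tempfile


def offsets_by_length(path):
    """Return the byte offset reached after each line, without calling tell()."""
    offsets = []
    position = 0
    # Binary mode: each line is a byte string, so len(line) is a byte count.
    with open(path, 'rb') as f:
        for line in f:
            position += len(line)
            offsets.append(position)
    return offsets


# Recreate the file from the question in a temporary location.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as tmp:
    tmp.write(b'1\n2\n3\n4\n')

result = offsets_by_length(path)
os.remove(path)
print(result)
```

For the four two-byte lines in the question this yields the offsets 2, 4, 6 and 8, and the read-ahead buffer no longer matters because the file position is never queried.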

Upvotes: 3

TJD

Reputation: 11896

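The problem is that for line in f: reads ahead through an internal buffer, and for a file this small the whole file is read before the first iteration of the loop body runs. So in each iteration of the loop, tell() just stays constant at the end of the file. For the desired behavior, you need to call readline() inside the loop instead of iterating over the file object.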
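A minimal sketch of the readline() approach, assuming the same file contents as in the question. The helper name line_end_offsets and the temporary file are illustrative, and binary mode is used here so that tell() reports plain byte offsets.

```python
import os
import tempfile


def line_end_offsets(path):
    """Return the file position reported by tell() after each readline() call."""
    offsets = []
    with open(path, 'rb') as f:
        # iter(callable, sentinel) keeps calling f.readline() until it
        # returns an empty byte string, which signals end of file.
        for line in iter(f.readline, b''):
            offsets.append(f.tell())
    return offsets


# Recreate the file from the question in a temporary location.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as tmp:
    tmp.write(b'1\n2\n3\n4\n')

offsets = line_end_offsets(path)
os.remove(path)
print(offsets)
```

Because readline() does not go through the read-ahead buffer used by next(), tell() advances line by line here and reports the offsets 2, 4, 6 and 8.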

Upvotes: 1
