Shikhar Srivastava

Reputation: 105

Why are file objects their own iterators in Python?

I am learning Python and this is confusing me. Wouldn't it be better if the file iterator and the file object were different objects? That way, a file could support multiple independent iterations. So why are Python file objects their own iterators?

Upvotes: 1

Views: 440

Answers (1)

Martijn Pieters

Reputation: 1123400

Because an open file handle at the OS level tracks only a single position, and because disk files are not the only type of I/O that Python file objects support. Files are streams of data and are handled much like other stream sources, such as sockets and pipes.
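To make the question concrete, here is a minimal sketch of what "a file object is its own iterator" means in practice. The temporary file and its contents are made up for illustration; the point is only that `iter(f)` returns `f` itself, so a second loop resumes where the first one stopped instead of starting over.

```python
import os
import tempfile

# Create a small throwaway file (contents are purely illustrative).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

with open(path) as f:
    same_object = iter(f) is f        # the file object is its own iterator
    first = next(f)                   # consumes the first line
    remaining = [line for line in f]  # a new loop resumes, it does not restart

os.remove(path)
print(same_object, repr(first), remaining)
```

Because there is only one position to advance, every loop over `f` shares it.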

Streams behave exactly like iterators, except that a disk file can also support seeking. Network sockets and pipes, on the other hand, don't support seeking, but to the OS and to Python those are also streams or files.
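The difference between seekable and non-seekable streams can be sketched as follows. This is an illustration only, and it assumes a POSIX-like system where a pipe reports itself as non-seekable; the variable names are arbitrary.

```python
import os
import tempfile

# A pipe is a stream that cannot be rewound (behaviour assumed for POSIX).
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb") as reader, os.fdopen(write_fd, "wb") as writer:
    pipe_seekable = reader.seekable()

# A disk-backed temporary file can be rewound with seek().
with tempfile.TemporaryFile("w+") as disk_file:
    disk_file.write("a\nb\n")
    disk_file.seek(0)
    disk_seekable = disk_file.seekable()
    first_pass = list(disk_file)   # reads both lines
    exhausted = list(disk_file)    # nothing left: the position is at the end
    disk_file.seek(0)              # rewind explicitly
    second_pass = list(disk_file)  # reads both lines again

print(pipe_seekable, disk_seekable, first_pass, exhausted, second_pass)
```

So re-iterating a disk file is possible, but only by moving the one shared position back with `seek(0)`; a pipe or socket offers no equivalent.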

This abstraction makes it possible to apply a lot of optimisations to file (and stream) handling, and it has been the de facto way of looking at files for decades.

Python could handle disk file objects differently and open up multiple OS file handles for a given Python file object. But that would be rather inefficient; the bottleneck is the communication with the disk, and although your OS can buffer data as it comes from the hard disk, you should generally avoid reading from the same file more than once.
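If you do need independent iteration over the same disk file, you can ask for it explicitly by opening the file more than once. The sketch below uses a made-up temporary file; each `open()` call yields a separate file object with its own position, at the cost of a second OS handle and potentially repeated reads.

```python
import os
import tempfile

# Create a small throwaway file (contents are purely illustrative).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")
    path = tmp.name

# Two separate file objects, each with an independent position.
with open(path) as a, open(path) as b:
    next(a)           # advance only the first handle past "one"
    from_a = next(a)  # second line, as seen by a
    from_b = next(b)  # first line, as seen by b

os.remove(path)
print(repr(from_a), repr(from_b))
```

This only works for things that can be reopened by name, which is one reason it is not the default behaviour for every stream.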

On top of this, there are the issues with writing to the disk. The OS already requires you to specify a file mode: reading or writing. You can open a file handle that can do both, but then the OS may have to handle caching differently, as it needs to take into account that your file may have been altered when you read the same bytes again. Python would have to replicate all of this if it were to allow multiple iterators while you are also writing to the stream.

Upvotes: 4
