nish

Reputation: 325

How does Python read lines from a file?

Consider the following simple Python code:

f=open('raw1', 'r')
i=1
for line in f:
    line1=line.split()
    for word in line1:
        print word,
print '\n'

In the first for loop, i.e. "for line in f:", how does Python know that I want to read a line and not a word or a character?

The second loop is clearer, since line1 is a list, so that loop iterates over the list's elements.

Upvotes: 1

Views: 289

Answers (2)

daniel gratzer

Reputation: 53871

Python has a notion of what are called "iterables". They're things that know how to let you traverse the data they hold. Common iterables include lists, sets, dicts, and pretty much every other data structure. Files are no exception.

An object becomes iterable by defining a method (__iter__) that returns an object with a next method (spelled next in Python 2 and __next__ in Python 3). This next method is meant to be called repeatedly, returning the next piece of data each time and raising StopIteration when nothing is left. A "for foo in bar" loop is just calling the next method repeatedly behind the scenes.
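To make that concrete, here is a rough sketch of what a for loop does behind the scenes. The helper name manual_for and the sample text are made up for illustration, and io.StringIO stands in for a real open file so the snippet runs without touching disk. It uses the built-in next(), which works in Python 2.6+ and Python 3.

```python
import io

# Stand-in for an open text file (hypothetical contents).
f = io.StringIO(u"first line\nsecond line\n")

def manual_for(iterable):
    """Collect items roughly the way a for loop would."""
    items = []
    iterator = iter(iterable)        # asks the iterable for its iterator
    while True:
        try:
            item = next(iterator)    # calls the iterator's next method
        except StopIteration:        # raised when the data is exhausted
            break
        items.append(item)
    return items

lines = manual_for(f)
print(lines)
```

Each item that comes back is one full line, newline included, because that is what the file-like object's next method hands out.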

For files, the next method returns lines, and that's it. It doesn't "know" that you want lines; it is simply always going to return lines. The reason for this is that ~50% of cases involving file traversal are by line, and if you want words,

 for word in (word for line in f for word in line.split()):
     ...

works just fine.
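If the nested generator expression is hard to read, the same idea can be spelled out as a small generator function. This is only a sketch: iter_words is a hypothetical name, and io.StringIO is used in place of a real file. It calls split() with no argument, which splits on any run of whitespace and so also drops the trailing newline.

```python
import io

def iter_words(lines):
    """Yield words one at a time from any iterable of lines."""
    for line in lines:
        for word in line.split():   # no argument: split on any whitespace
            yield word

# Stand-in for an open text file (hypothetical contents).
f = io.StringIO(u"the quick brown\nfox jumps\n")
words = list(iter_words(f))
print(words)
```

With a real file you would write "for word in iter_words(f):" and process each word as it arrives, without building the list.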

Upvotes: 4

Prahalad Deshpande

Reputation: 4767

In Python the for..in syntax is used over iterables (objects that can be iterated over). For a file object, the iterator is the file itself.

See the documentation of the file object's next() method; an excerpt is pasted below:

A file object is its own iterator, for example iter(f) returns f (unless f is closed). When a file is used as an iterator, typically in a for loop (for example, for line in f: print line), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit when the file is open for reading (behavior is undefined when the file is open for writing). In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right. However, using seek() to reposition the file to an absolute position will flush the read-ahead buffer. New in version 2.3.
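The "file is its own iterator" point can be checked directly. The sketch below uses io.StringIO with made-up contents as a stand-in for an open file; a file opened with open() behaves the same way in this respect. Because there is only one iterator, a for loop started after a manual next() call picks up where that call left off.

```python
import io

# Stand-in for an open text file (hypothetical contents).
f = io.StringIO(u"a\nb\nc\n")

same_object = iter(f) is f       # the file-like object returns itself
first = next(f)                  # consumes the first line
rest = [line for line in f]      # the loop continues from the second line

print(same_object)
print(repr(first))
print(rest)
```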

Upvotes: 3
