Reputation: 21
For example, I have 2,000 lines in a file, and I want to read 500 lines at a time and do something with these 500 lines before reading another 500 lines. I wonder if anyone would write some quick code for me to learn. Thanks!
Upvotes: 2
Views: 1468
Reputation: 29113
Correct me if I'm wrong, but I think this very basic sample will work too:
linesToProceed = 500
with open(filename, 'r') as f:
    lines = []
    for i, line in enumerate(f):
        lines.append(line)
        if (i + 1) % linesToProceed == 0:
            # do something with the 500 lines collected in lines
            lines = []
    if lines:
        # do something with the leftover lines (fewer than 500)
        pass
Upvotes: 0
Reputation: 53829
You could also use itertools.islice to read 500 lines at a time:
lines = itertools.islice(file_obj, 500)
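A single islice call like the one above only yields the first 500 lines, so to work through the whole file you would call it in a loop. A rough sketch (the file path is just a placeholder):
from itertools import islice

with open('/path/to/file') as file_obj:
    while True:
        chunk = list(islice(file_obj, 500))
        if not chunk:
            break
        # do something with the up-to-500 lines in chunk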
Upvotes: 0
Reputation: 26258
You could use a generator to group the lines together, and yield them in a way that is convenient to use in a simple for loop. This might get you started:
def chunks_of(iterable, chunk_size=500):
    out = []
    for item in iterable:
        out.append(item)
        if len(out) >= chunk_size:
            yield out
            out = []
    if out:
        yield out
You can then use this like:
for chunk_of_lines in chunks_of(open('/path/to/file'), chunk_size=500):
    # chunk_of_lines is 500 or fewer lines from the file
(Why "500 or fewer"? Because the last chunk might not be 500 lines if the number of lines in the file was not an even multiple of 500.)
Edit: Always check the docs first. Here's a recipe from the itertools docs:
from itertools import izip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
This creates a list of n iterators on the iterable (in this case, the file object). Since they are all iterators on the same underlying object, advancing one advances them all, so zipping them groups the items n at a time. izip_longest works like izip, but pads the final group with the fillvalue instead of dropping the leftover items; my chunks_of function, by contrast, just yields a shorter final chunk.
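For reference, a sketch of how this recipe might be applied to the question, assuming the grouper function defined above and a placeholder file path; the None padding that izip_longest adds to the final group is filtered out before use (in Python 3 the function is named itertools.zip_longest):
with open('/path/to/file') as f:
    for group in grouper(500, f):
        # drop the fillvalue padding from the last, possibly short group
        lines = [line for line in group if line is not None]
        # do something with the up-to-500 lines in lines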
Upvotes: 7