Reputation: 5029
I want to generate a bunch of files based on a template. The template has thousands of lines, and for each new file only the top 5 lines differ. What is the best way to read all the lines except the first 5 at once, instead of reading the whole file line by line?
Upvotes: 1
Views: 94
Reputation: 140148
One approach would be to create a list of the first 5 lines, and read the rest into one big buffer:
with open("input.txt") as f:
    first_lines = [f.readline() for _ in range(5)]
    rest_of_lines = f.read()
or, more symmetrically for the first part, create one small buffer with the 5 lines:

    first_lines = "".join([f.readline() for _ in range(5)])
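Applied to the original problem of generating many files from one template, the idea above can be sketched as follows (the file names and the header contents are placeholders, not from the post: read the shared body once, then write each output file as its own header plus that body):

```python
# Create a small stand-in template so the example is self-contained;
# in practice this would be the real thousands-of-lines template.
with open("template.txt", "w") as f:
    f.write("h1\nh2\nh3\nh4\nh5\n" + "shared line\n" * 3)

with open("template.txt") as f:
    for _ in range(5):
        f.readline()              # skip the template's own 5 header lines
    body = f.read()               # the shared body, read once into one buffer

# Hypothetical per-file headers: only these 5 lines differ between files.
headers = [
    "a1\na2\na3\na4\na5\n",
    "b1\nb2\nb3\nb4\nb5\n",
]
for i, header in enumerate(headers):
    with open(f"output_{i}.txt", "w") as out:
        out.write(header)
        out.write(body)
```

Since body is read only once and reused for every output file, the template is never re-parsed per file.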
As an alternative, from a purely I/O point of view, the quickest would be
with open("input.txt") as f:
    lines = f.read()
and then use a line-split generator to extract the first 5 lines (splitlines() would be disastrous in terms of memory copies; you can find an implementation here).
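A minimal sketch of that lazy-split idea, using str.find to peel off the first 5 lines of one big string without the full copy that splitlines() would make (the sample text is made up for illustration):

```python
# One big in-memory buffer standing in for f.read().
text = "".join(f"line {n}\n" for n in range(10))

pos = 0
first_lines = []
for _ in range(5):
    nl = text.find("\n", pos)     # locate the next newline from the cursor
    if nl == -1:                  # fewer than 5 lines in the buffer
        break
    first_lines.append(text[pos:nl + 1])
    pos = nl + 1

rest = text[pos:]                 # one slice covers everything after line 5
```

Only the 5 small header slices and one tail slice are copied; the body is never split into thousands of line objects.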
Upvotes: 3
Reputation: 11075
File objects in Python are conveniently their own iterators, so when you write for line in f: ...
you get the file line by line. The file object maintains what's generally referred to as a cursor that keeps track of where you're reading from. When you use the generic for
loop, this cursor advances to the next newline each time and returns what it has read. If you interrupt the loop before the end of the file, you can pick back up where you left off with another loop, or with a call to f.read()
to read the rest of the file:
with open(inputfile, 'r') as f:
    lineN = 0
    header = ""
    for line in f:
        header = header + line
        lineN += 1
        if lineN >= 5:  # stop once the first 5 lines have been read
            break
    body = f.read()  # read the rest of the file
Upvotes: 1