Reputation: 1252
I'm sure I'm missing something obvious, and this has probably been asked before, but I can't find the right combination of keywords to turn up an answer.
How can I write out the first n lines of a file (in effect, the opposite of file.readlines()[0:10])?
For example, I have a function that takes an input file and processes information from the latter part, discarding a header. However, I want to keep the multi-line header so it can be put back into an output file.
def readInfile(infile):
    with open(infile, 'r') as ifh:
        # Skip extra info at top of file
        header = ifh.readline()[0:10] # Keep the header for later?
        noheader = ifh.readlines()[11:]
        for line in noheader:
            # Do the useful stuff
            usefulstuff = foo()
    return usefulstuff, header
Then later I want to write out in the format of the input file, using their header:
print(header)
for thing in usefulstuff:
    print(thing)
Is there a method I'm missing, or is readlines no good for this as it returns a list?
I assumed
for line in header:
    print(line)
would work, but it doesn't seem to in this case - so I must be doing something wrong?
EDIT
Why does trying to use readlines()[] twice fail for the second range?
I fixed the code as @pbuck pointed out (the header line should have used readlines(), not readline()), but now the noheader variable is empty. Do I really have to open the file twice?!
Upvotes: 1
Views: 13646
Reputation: 1216
I have checked your solution and it seems you are on the right track. Consider this alternative using Python's mmap module (https://docs.python.org/2/library/mmap.html), which lets you treat the file as a string as well as a file. Here is my solution:
import mmap

def main(offset):
    with open("pks.txt","r+b") as fd:
        #Get the lines to skip
        try:
            skip=fd.readlines()[0:offset]
            #Total length in bytes of the skipped lines
            lines=sum([len(x) for x in skip])
            rfile=mmap.mmap(fd.fileno(),0)
            rfile.seek(lines)
            print("Header: %s"%skip)
            print("Other lines:")
            line=rfile.readline()
            usefulStuff=list()
            while (len(line)>0):
                usefulStuff.append(line.rstrip()) #Remove the trailing newline
                line=rfile.readline()
            return usefulStuff,skip
        except TypeError as e:
            #Raised when offset is not an integer (slicing past the end of the file does not raise)
            print("Error: %s"%str(e))
            return None,None

if __name__=='__main__':
    footer,header=main(3)
    print("Header: %s\nFooter: %s"%(header,footer))
Upvotes: 0
Reputation: 123473
There aren't two readlines() calls. Initially you call readline(), which reads a single line from the file. Next you call readlines() and, with the [11:] slice, discard the first 11 lines of the list it returns.
This would be a better way to do it:
def foo(lines):
    return ['foo: ' + line for line in lines]

def readInfile(infile):
    with open(infile, 'r') as ifh:
        lines = ifh.read().splitlines(False) # read in the whole file, separate into lines
    header = lines[:10]
    usefulstuff = foo(lines[10:])
    return usefulstuff, header

usefulstuff, header = readInfile('name_of_file.txt')
for line in header:
    print(line)
for line in usefulstuff:
    print(line)
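If the goal is to put the header back into an output file rather than print it, the two lists can be written out the same way. This is only a sketch: write_outfile and the output file name are hypothetical, and it assumes the lines carry no trailing newlines (as is the case after splitlines(False)).

```python
def write_outfile(outfile, header, usefulstuff):
    """Write the kept header lines, then the processed lines, to outfile.

    Hypothetical helper: assumes neither list's items end in a newline,
    so one is added after each line.
    """
    with open(outfile, 'w') as ofh:
        for line in header:
            ofh.write(line + '\n')
        for line in usefulstuff:
            ofh.write(line + '\n')
```

With the names from the code above, it would be called as write_outfile('name_of_output.txt', header, usefulstuff).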
Upvotes: 0
Reputation: 40894
Literally: read the first n lines, then stop.
def read_first_lines(filename, limit):
    result = []
    with open(filename, 'r') as input_file:
        # files are iterable, you can have a for-loop over a file.
        for line_number, line in enumerate(input_file):
            if line_number >= limit: # line_number starts at 0.
                break
            result.append(line)
    return result
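Because a file object is its own iterator, you can also stop after the header and keep reading from the same handle, so the file does not need to be opened twice. A minimal sketch using itertools.islice from the standard library; split_header and its parameter names are hypothetical.

```python
from itertools import islice

def split_header(filename, n_header):
    """Return (header, body) as two lists of lines, newlines included."""
    with open(filename, 'r') as input_file:
        # islice consumes at most n_header lines from the file iterator.
        header = list(islice(input_file, n_header))
        # Iteration resumes where islice stopped, so this is everything else.
        body = list(input_file)
    return header, body
```

If the file has fewer than n_header lines, header holds all of them and body is an empty list.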
Upvotes: 3
Reputation: 449
Careful there: readline() returns a string, so ifh.readline()[0:10] gives you the first 10 characters of the first line, and noheader = ifh.readline()[11:] gives you part of the next line.
What you could do is use loops like so:
header = ""
for i in range(10):
    header += ifh.readline()
Or, as @pbuck suggests in their comment, use readlines() (note the s), which returns a list containing each line in your file and looks more like what you were trying to do.
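The difference, and the reason a second readlines() call on the same handle comes back empty, can be seen in a small sketch. It uses io.StringIO as a stand-in for an open file so it runs without one; the sample text is made up.

```python
import io

# io.StringIO stands in for an open text file; the contents are made up.
ifh = io.StringIO("line 0\nline 1\nline 2\nline 3\n")

first = ifh.readline()      # one string: the first line
first_chars = first[0:4]    # slicing a string yields characters, not lines

rest = ifh.readlines()      # a list of all remaining lines
again = ifh.readlines()     # handle is at end of file, so this list is empty

ifh.seek(0)                 # rewind instead of opening the file again
all_lines = ifh.readlines()
header, body = all_lines[:2], all_lines[2:]  # slice one list twice
```

Calling readlines() once and slicing the resulting list twice avoids both the empty second read and the need to reopen the file.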
Upvotes: 2