Reputation: 122168
I have a file and it has some header lines, e.g.
header1 lines: somehting something
more headers then
somehting something
----
this is where the data starts
yes data... lots of foo barring bar fooing data.
...
...
I've skipped the header lines by looping and calling file.readlines(). Other than looping and concatenating the rest of the lines, how else can I read the rest of the lines?
x = """header1 lines: somehting something
more headers then
somehting something
----
this is where the data starts
yes data... lots of foo barring bar fooing data.
...
..."""
with open('test.txt','w') as fout:
    print>>fout, x
fin = open('test.txt','r')
for _ in range(5): fin.readline();
rest = "\n".join([i for i in fin.readline()])
Upvotes: 1
Views: 5175
Reputation: 1123830
.readlines() reads all the data in the file in one go. There are no more lines to read after the first call.
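The behaviour is easy to check in isolation. In this minimal sketch, io.StringIO is an assumption standing in for an open text file, and the sample content is made up for illustration:

```python
import io

# io.StringIO stands in for a file opened in text mode (illustrative data).
fin = io.StringIO(u"header\n----\ndata 1\ndata 2\n")

first = fin.readlines()   # consumes every remaining line in one call
second = fin.readlines()  # the file position is now at the end

print(len(first), second)
```

The second call returns an empty list, which is why nothing is left to read after a loop that calls .readlines().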
You probably wanted to use .readline() (no s, singular) instead:
with open('test.txt','r') as fin:
    for _ in range(5): fin.readline()
    # each line keeps its trailing newline, so join on the empty string
    rest = "".join(fin.readlines())
Note that because .readlines() returns a list already, you don't need to loop over the items. You could also just use .read() to read in the remainder of the file:
with open('test.txt','r') as fin:
    for _ in range(5): fin.readline()
    rest = fin.read()
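For a version that can be run end to end, here is a rough sketch. The helper name read_after_header, the temporary file, and the sample text (modeled on the one in the question) are all illustrative assumptions, not part of the original posts:

```python
import os
import tempfile

# Sample modeled on the question: four header lines, then data.
SAMPLE = (
    "header1 lines: something something\n"
    "more headers then\n"
    "something something\n"
    "----\n"
    "this is where the data starts\n"
    "yes data... lots of foo barring bar fooing data.\n"
    "...\n"
    "...\n"
)


def read_after_header(path, n_skip):
    """Return everything after the first n_skip lines as one string."""
    with open(path, "r") as fin:
        for _ in range(n_skip):
            fin.readline()  # discard one line per call
        return fin.read()   # the rest of the file, newlines intact


# Write the sample to a temporary file so the sketch runs as-is.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as fout:
    fout.write(SAMPLE)

rest = read_after_header(path, 5)
os.remove(path)
print(rest)
```

Skipping five lines, as in the question, also drops the "this is where the data starts" line; pass 4 instead if that line should be kept.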
Alternatively, treat the file object as an iterable and use itertools.islice() to slice it, skipping the first five lines:
from itertools import islice
with open('test.txt','r') as fin:
    all_but_the_first_five = list(islice(fin, 5, None))
This does produce lines, not one large string, but if you are processing the input file line by line, that usually is preferable anyway. You can loop directly over the slice and handle lines:
with open('test.txt','r') as fin:
    for line in islice(fin, 5, None):
        # process line, first 5 will have been skipped
        pass
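As a small runnable illustration of looping over the slice, the sketch below uses io.StringIO as a stand-in for the open file (an assumption for the sake of a self-contained example) and an arbitrary per-line transformation:

```python
import io
from itertools import islice

# io.StringIO stands in for an open text file; both yield lines when iterated.
fin = io.StringIO(u"h1\nh2\nh3\nh4\nh5\ndata 1\ndata 2\n")

processed = []
for line in islice(fin, 5, None):  # lazily skips the first five lines
    # Hypothetical processing step: strip the newline and upper-case the text.
    processed.append(line.rstrip("\n").upper())

print(processed)
```

Because islice() is lazy, only one line is held in memory at a time, which matters for large data files.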
Don't mix using a file object as an iterable with .readline(). In Python 2, the iteration protocol as implemented by file objects uses an internal read-ahead buffer for efficiency that .readline() doesn't know about, so calling .readline() after iterating is liable to return data further on in the file than you expect.
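One way to follow this advice, assuming you want to keep iterating over the file afterwards, is to skip the header lines with the built-in next() instead of .readline(), so every read goes through the same iteration protocol. A minimal sketch, again with io.StringIO standing in for the file and made-up sample data:

```python
import io

# io.StringIO stands in for an open text file (illustrative data).
fin = io.StringIO(u"h1\nh2\n----\ndata 1\ndata 2\n")

for _ in range(3):
    # Advance the iterator by one line; the None default avoids
    # StopIteration if the file has fewer lines than expected.
    next(fin, None)

# Continue with plain iteration; no .readline() calls are mixed in.
data_lines = [line for line in fin]

print(data_lines)
```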
Upvotes: 5
Reputation: 142216
Skip the first 5 lines:
from itertools import islice
with open('yourfile') as fin:
    data = list(islice(fin, 5, None))
# or, instead of building a list, loop line by line
with open('yourfile') as fin:
    for line in islice(fin, 5, None):
        print(line)
Upvotes: 1