alvas

Reputation: 122168

How to read the rest of the lines? - python

I have a file and it has some header lines, e.g.

header1 lines: somehting something
more headers then
somehting something
----

this is where the data starts
yes data... lots of foo barring bar fooing data.
...
...

I've skipped the header lines by looping and calling file.readlines(). Other than looping and concatenating the rest of the lines, how else can I read the rest of the file?

x = """header1 lines: somehting something
more headers then
somehting something
----

this is where the data starts
yes data... lots of foo barring bar fooing data.
...
..."""

with open('test.txt','w') as fout:
  print>>fout, x

fin = open('test.txt','r')
for _ in range(5): fin.readline();
rest = "\n".join([i for i in fin.readline()])

Upvotes: 1

Views: 5175

Answers (2)

Martijn Pieters

Reputation: 1123830

.readlines() reads all the data in the file in one go. There are no more lines to read after the first call.

You probably wanted to use .readline() (no s, singular) instead:

with open('test.txt','r') as fin:
    for _ in range(5): fin.readline()
    # each line keeps its trailing newline, so join on the empty string
    rest = "".join(fin.readlines())

Note that because .readlines() returns a list already, you don't need to loop over the items. You could also just use .read() to read in the remainder of the file:

with open('test.txt','r') as fin:
    for _ in range(5): fin.readline()
    rest = fin.read()
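
As a self-contained check of the .readline() plus .read() approach, the sketch below writes the sample from the question to a temporary file, skips the five header lines, and reads the remainder as one string. The helper name read_after_header and the temporary file are illustrative assumptions, not part of the question, and the code is written for Python 3.

```python
import os
import tempfile

SAMPLE = """header1 lines: somehting something
more headers then
somehting something
----

this is where the data starts
yes data... lots of foo barring bar fooing data.
...
...
"""

def read_after_header(path, header_lines=5):
    """Return everything after the first `header_lines` lines as one string."""
    with open(path, 'r') as fin:
        for _ in range(header_lines):
            fin.readline()  # discard one header line per call
        return fin.read()   # the rest of the file, newlines preserved

# Write the sample to a temporary file so the example runs as-is.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(SAMPLE)

try:
    rest = read_after_header(tmp.name)
finally:
    os.remove(tmp.name)

print(rest)
```

Because .read() returns the text unchanged, there is no need to re-join lines or re-insert newlines afterwards.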

Alternatively, treat the file object as an iterable and use itertools.islice() to slice it, skipping the first five lines:

from itertools import islice

with open('test.txt','r') as fin:
    all_but_the_first_five = list(islice(fin, 5, None))

This does produce lines, not one large string, but if you are processing the input file line by line, that usually is preferable anyway. You can loop directly over the slice and handle lines:

with open('test.txt','r') as fin:
    for line in islice(fin, 5, None):
        # process line; the first 5 will have been skipped
        pass
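
To see what the sliced iterable actually yields, here is a small sketch that uses io.StringIO as a stand-in for an open file. That stand-in and the sample text are assumptions made to keep the example self-contained, and the code is written for Python 3. Each yielded line keeps its trailing newline, so joining on an empty string rebuilds the remainder of the input.

```python
from io import StringIO
from itertools import islice

# Hypothetical sample: four header lines, one blank line, then two data lines.
text = "h1\nh2\nh3\n----\n\nfirst data line\nsecond data line\n"
fin = StringIO(text)  # iterates line by line, like a file object

# Skip the first five lines and collect the rest.
lines = list(islice(fin, 5, None))

# Lines still end in "\n", so no separator is needed when joining.
rest = "".join(lines)

print(lines)
print(rest)
```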

Don't mix using a file object as an iterable with calls to .readline(). In Python 2, the iteration protocol as implemented by file objects uses an internal read-ahead buffer for efficiency, and .readline() doesn't know about that buffer. Calling .readline() after iterating is therefore liable to return data further on in the file than you expect.

Upvotes: 5

Jon Clements

Reputation: 142216

Skip the first 5 lines:

from itertools import islice

with open('yourfile') as fin:
    # read the remaining lines into a list
    data = list(islice(fin, 5, None))

# or, as an alternative, loop line by line
with open('yourfile') as fin:
    for line in islice(fin, 5, None):
        print line

Upvotes: 1
