stefanB

Reputation: 79830

Elegant way to skip first line when using python fileinput module?

Is there an elegant way of skipping the first line of a file when using the Python fileinput module?

I have a data file with nicely formatted data, but the first line is a header. Using fileinput I would have to include a check and discard the line if it does not seem to contain data.

The problem is that the same check would then be applied to every remaining line of the file. With a regular file object you can open the file, read the first line, and then loop over the rest of the file. Is there a similar trick with fileinput?

Is there an elegant way to skip processing of the first line?

Example code:

import fileinput

# how to skip first line elegantly?

for line in fileinput.input(["file.dat"]):
    data = process_line(line)
    output(data)

Upvotes: 14

Views: 16469

Answers (6)

fivethous

Reputation: 79

Do two loops where the first one calls break immediately.

with fileinput.input(files=files, mode='r', inplace=True) as f:

    for line in f:
        # add print() here if you only want to empty the line
        break

    for line in f:
        process(line)

Let's say you want to remove or empty the first 5 lines.

with fileinput.input(files=files, mode='r', inplace=True) as f:

    for line in f:
        # add print() here if you only want to empty the first 5 lines
        if f.filelineno() == 5:
            break

    for line in f:
        process(line)

But if you only want to get rid of the first line, just call next() before the loop to discard it.

with fileinput.input(files=files, mode='r', inplace=True) as f:
    next(f)
    for line in f:
        process(line)
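
With inplace=True, standard output is redirected into the file, so a line survives only if the loop prints it. Here is a minimal, self-contained sketch of the next() variant under that assumption. The helper name drop_first_line and the sample data are made up for illustration.

```python
import fileinput
import os
import tempfile


def drop_first_line(path):
    """Rewrite the file at `path` in place without its first line."""
    with fileinput.input(files=[path], inplace=True) as f:
        next(f, None)  # consume the first line without printing it
        for line in f:
            # stdout is redirected into the file while inplace=True is active
            print(line, end="")


# Hypothetical sample data written to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("header\n1\n2\n")

drop_first_line(tmp.name)

with open(tmp.name) as fh:
    remaining = fh.read()
os.remove(tmp.name)
```

After the call, the file holds only the two data lines. Anything the loop does not print is lost, so a process(line) that returns a value instead of printing would leave the file empty.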

Upvotes: 0

Aivar Paalberg

Reputation: 5149

One option is to use openhook:

The openhook, when given, must be a function that takes two arguments, filename and mode, and returns an accordingly opened file-like object. You cannot use inplace and openhook together.

One could create a helper function skip_header and use it as the openhook, something like:

import fileinput

files = ['file_1', 'file_2']

def skip_header(filename, mode):
    f = open(filename, mode)
    next(f, None)  # discard the header; the default avoids StopIteration on an empty file
    return f


for line in fileinput.input(files=files, openhook=skip_header):
    pass  # do something with line

Upvotes: 0

user3698773

Reputation: 1039

with open(file) as j:  # open the file as j
    for i in j.readlines()[1:]:  # readlines() loads the whole file; [1:] drops the first line
        process(i)  # placeholder for the per-line work
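
If the file is large, the same idea can be written without reading everything into a list first. This is only a sketch of the plain-file approach, not of fileinput. The generator name data_lines and the sample file are assumptions for illustration.

```python
import os
import tempfile


def data_lines(path):
    """Yield every line of `path` except the first, one at a time."""
    with open(path) as f:
        next(f, None)  # skip the header; None avoids StopIteration on an empty file
        yield from f   # hand out the remaining lines lazily


# Hypothetical sample data written to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("header\n1\n2\n")

rows = list(data_lines(tmp.name))
os.remove(tmp.name)
```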

Upvotes: -1

nosklo

Reputation: 223032

import fileinput

lines = iter(fileinput.input(["file.dat"]))
next(lines)  # extract and discard first line
for line in lines:
    data = process_line(line)
    output(data)

Or use itertools.islice if you prefer:

import fileinput
import itertools
finput = fileinput.input(["file.dat"])
lines = itertools.islice(finput, 1, None) # cuts off first line
dataset = (process_line(line) for line in lines)
results = [output(data) for data in dataset]

Since everything used here is a generator or an iterator, no intermediate list is built.
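
One caveat if more than one file is passed: fileinput chains the files into a single stream, so next() (or islice) drops only the very first line of the first file. A small sketch to check this, with made-up file contents and a hypothetical write_sample helper:

```python
import fileinput
import os
import tempfile


def write_sample(text):
    """Write `text` to a new temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
        tmp.write(text)
    return tmp.name


# Two hypothetical data files, each starting with its own header line.
paths = [write_sample("header_a\na1\n"), write_sample("header_b\nb1\n")]

with fileinput.input(files=paths) as lines:
    next(lines)  # discards "header_a" only
    kept = [line.rstrip("\n") for line in lines]

for p in paths:
    os.remove(p)
```

Here kept still contains header_b, so for several files with headers a per-file approach is needed.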

Upvotes: 18

Phil

Reputation: 4867

The fileinput module contains a bunch of handy functions, one of which seems to do exactly what you're looking for:

for line in fileinput.input(["file.dat"]):
    if not fileinput.isfirstline():
        data = process_line(line)
        output(data)

fileinput module documentation
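
Note that isfirstline() is true for the first line of each file, not just the first line overall, so this skips every header when several files are given. The check does still run once per line. A minimal sketch with made-up file contents and a hypothetical write_sample helper:

```python
import fileinput
import os
import tempfile


def write_sample(text):
    """Write `text` to a new temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
        tmp.write(text)
    return tmp.name


# Two hypothetical data files, each starting with its own header line.
paths = [write_sample("header_a\na1\n"), write_sample("header_b\nb1\n")]

kept = []
with fileinput.input(files=paths) as f:
    for line in f:
        if fileinput.isfirstline():
            continue  # skip the header of whichever file is being read
        kept.append(line.rstrip("\n"))

for p in paths:
    os.remove(p)
```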

Upvotes: 16
