Ha Hacker

Reputation: 407

Removing first line of Big CSV file?

How should I remove the first line of a big CSV file in Python? I looked at the previous solutions here. One was:

with open("test.csv",'r') as f:
    with open("updated_test.csv",'w') as f1:
        f.next() # skip header line
        for line in f:
            f1.write(line)

which gave me this error:

f.next() # skip header line
AttributeError: '_io.TextIOWrapper' object has no attribute 'next'

The other solution was:

with open('file.txt', 'r') as fin:
    data = fin.read().splitlines(True)
with open('file.txt', 'w') as fout:
    fout.writelines(data[1:])

which causes memory problems, because it reads the whole file at once!

Upvotes: 7

Views: 10630

Answers (3)

Alex

Reputation: 6037

Using sed is probably the fastest. With -i, sed still writes a temporary file behind the scenes, but you don't have to manage one yourself. A Python wrapper would be:

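import subprocess

def delete_first_lines(filename, line_nums):
    # sed script "1,Nd" deletes lines 1 through N
    script = '1,{}d'.format(line_nums)
    # GNU sed syntax; BSD/macOS sed needs a suffix argument after -i (e.g. -i '')
    # run() waits for sed to finish; check=True raises CalledProcessError on failure
    subprocess.run(['sed', '-i', script, filename], check=True)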

Upvotes: 0

aberna

Reputation: 5814

Use f.__next__() instead of f.next(). Python 3 renamed the iterator method from next() to __next__().

Documentation here: https://docs.python.org/3/library/stdtypes.html#iterator.next
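
As a small illustration (a sketch assuming Python 3, with skip_header being a hypothetical helper name and io.StringIO standing in for a real open file), the built-in next() calls __next__() for you and accepts a default, which avoids a StopIteration when the file is empty:

```python
import io

def skip_header(lines):
    """Consume and return the first line of an iterator, or None if it is empty."""
    # next(it, default) calls it.__next__() and returns the default instead of
    # raising StopIteration when the iterator is already exhausted.
    return next(lines, None)

f = io.StringIO("id,name\n1,a\n2,b\n")  # in-memory stand-in for an open file
header = skip_header(f)                 # consumes only the first line
rest = list(f)                          # remaining lines are still available
empty = skip_header(io.StringIO(""))    # empty input: no exception, just None
```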

Upvotes: 0

Hackaholic

Reputation: 19733

Replace f.next() with next(f):

with open("test.csv",'r') as f, open("updated_test.csv",'w') as f1:
    next(f) # skip header line
    for line in f:
        f1.write(line)
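
If the goal is to end up with the same filename rather than a second file, the same streaming idea can be combined with a temporary file and a rename. This is only a sketch under some assumptions: drop_first_line is a hypothetical name, the file is read in binary mode so no decoding happens, and the directory containing the file is assumed to be writable.

```python
import os
import shutil
import tempfile

def drop_first_line(path):
    """Remove the first line of `path` without loading the file into memory."""
    directory = os.path.dirname(os.path.abspath(path))
    # Binary mode skips text decoding and leaves line endings untouched.
    with open(path, "rb") as src:
        src.readline()  # skip the header line; returns b"" on an empty file
        # Create the temp file in the same directory so the final rename
        # stays on one filesystem.
        with tempfile.NamedTemporaryFile("wb", dir=directory, delete=False) as dst:
            shutil.copyfileobj(src, dst)  # copies in fixed-size chunks
            tmp_name = dst.name
    os.replace(tmp_name, path)  # swap the new file in for the original
```

Calling drop_first_line("test.csv") would rewrite test.csv in place. One caveat: the replacement file takes on the temporary file's permissions, which are typically more restrictive than the original's, so you may want to copy the mode over if that matters.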

Upvotes: 11
