Reputation: 407
How should I remove the first line of a big CSV file in Python? I looked at the earlier solutions posted here. One was:
with open("test.csv",'r') as f:
    with open("updated_test.csv",'w') as f1:
        f.next() # skip header line
        for line in f:
            f1.write(line)
which gave me this error:
f.next() # skip header line
AttributeError: '_io.TextIOWrapper' object has no attribute 'next'
the other solution was:
with open('file.txt', 'r') as fin:
    data = fin.read().splitlines(True)
with open('file.txt', 'w') as fout:
    fout.writelines(data[1:])
This runs into memory problems, because it reads the whole file into memory at once.
Upvotes: 7
Views: 10630
Reputation: 6037
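Using sed is probably the fastest option, and you don't have to manage a temp file yourself (sed -i handles that internally), so a Python wrapper would be:
import subprocess

def delete_first_lines(filename, line_nums):
    # Build a sed script such as '1,3d', which deletes lines 1 through line_nums.
    n = '1,{}d'.format(line_nums)
    # run() waits for sed to finish; check=True raises CalledProcessError if sed fails.
    # This is GNU sed syntax; BSD/macOS sed expects a backup suffix argument after -i.
    subprocess.run(['sed', '-i', n, filename], check=True)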
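If sed isn't available (on Windows, say), the same "drop the first N lines" idea can be done in pure Python. This is only a minimal sketch: the function name `copy_without_first_lines` and its parameters are made up for illustration, and it writes to a separate output file rather than editing in place.

```python
from itertools import islice


def copy_without_first_lines(src_path, dst_path, line_count):
    """Copy src_path to dst_path, skipping the first line_count lines.

    Hypothetical helper for illustration; the name is not from any library.
    """
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        # islice lazily skips the leading lines, so only one line
        # is held in memory at a time.
        dst.writelines(islice(src, line_count, None))
```

Usage would be something like `copy_without_first_lines("test.csv", "updated_test.csv", 1)`. If `line_count` is larger than the file, the output is simply empty.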
Upvotes: 0
Reputation: 5814
Use f.__next__() instead of f.next(); Python 3 renamed the iterator method from next to __next__.
documentation here: https://docs.python.org/3/library/stdtypes.html#iterator.next
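One thing to watch out for: f.__next__() raises StopIteration if the file is empty. The built-in next() accepts a default that avoids this. A small sketch, where `copy_without_header` is just an illustrative name and `src`/`dst` are assumed to be any open text file objects:

```python
def copy_without_header(src, dst):
    """Copy lines from src to dst, skipping the first line.

    Returns the skipped header line, or None if src was empty.
    Hypothetical helper for illustration only.
    """
    # The second argument to next() is returned instead of raising
    # StopIteration when the iterator is already exhausted.
    header = next(src, None)
    for line in src:
        dst.write(line)
    return header
```

With an empty input this returns None and writes nothing, instead of raising.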
Upvotes: 0
Reputation: 19733
Replace f.next() with next(f):
with open("test.csv",'r') as f, open("updated_test.csv",'w') as f1:
    next(f) # skip header line
    for line in f:
        f1.write(line)
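If you want the original file itself to end up without the header (rather than keeping a second file around), one option is to write to a temp file and swap it in afterwards. A rough sketch, assuming the header is a single physical line (no quoted newlines inside it); `drop_header_in_place` is a made-up name:

```python
import os
import shutil
import tempfile


def drop_header_in_place(path):
    """Remove the first line of path without loading the file into memory.

    Hypothetical helper; assumes the header occupies exactly one line.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with open(path, "rb") as src, os.fdopen(fd, "wb") as dst:
            src.readline()                # consume the header (b"" if file is empty)
            shutil.copyfileobj(src, dst)  # copy the rest in fixed-size chunks
        os.replace(tmp_path, path)        # swap the new file in for the old one
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        os.remove(tmp_path)
        raise
```

Working in binary mode avoids decoding the whole file and leaves line endings and encoding exactly as they were. It does need enough free disk space for a second copy while it runs.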
Upvotes: 11