Reputation: 5157
I am scanning a CSV file and making adjustments line by line. At the end, I would like to remove the last line. How can I remove the last line within the same scanning loop?
My code below reads from the original file, makes adjustments and finally writes to a new file.
import csv
raw_data = csv.reader(open("original_data.csv", "r"), delimiter=",")
output_data = csv.writer(open("final_data.csv", "w"), delimiter=",")
lastline = # integer index of last line
for i, row in enumerate(raw_data):
    if i == 10:
        # some operations
        output_data.writerow(row)
    elif i > 10 and i < lastline:
        # some operations
        output_data.writerow(row)
    elif i == lastline:
        output_data.writerow([])
    else:
        continue
Upvotes: 1
Views: 799
Reputation: 214959
A variation of @Kolmar's idea:
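def all_but_last(it):
    it = iter(it)  # accept any iterable, not only iterators
    try:
        buf = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    for item in it:
        yield buf
        buf = item

for line in all_but_last(...):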
Here's more generic code that extends islice (two-argument version) to negative indexes:
import itertools, collections
def islice2(it, stop):
    it = iter(it)  # one shared iterator, so the loop below resumes where islice stopped
    if stop >= 0:
        for x in itertools.islice(it, stop):
            yield x
    else:
        # buffer -stop items; the final -stop items are never yielded
        d = collections.deque(itertools.islice(it, -stop))
        for item in it:
            yield d.popleft()
            d.append(item)

for x in islice2(xrange(20), -5):
    print x,
# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Upvotes: 2
Reputation: 13410
You can iterate with a window of size 2 and write only the first value of each window. The last element is then skipped:
from itertools import izip, tee  # Python 2; on Python 3 use the built-in zip instead of izip
def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return izip(a, b)

for row, _ in pairwise(raw_data):
    output_data.writerow(row)
output_data.writerow([])
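For readers on Python 3, where izip is gone, the same window idea can be sketched with the built-in zip. This is only an illustration on an in-memory list; the names rows and kept are made up for the example and do not come from the question.

```python
from itertools import tee


def pairwise(iterable):
    """Return consecutive overlapping pairs: (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)


# Hypothetical sample data standing in for the rows of a csv.reader.
rows = [["a", "b"], ["1", "2"], ["3", "4"]]

# Keeping only the first element of each pair drops the final row.
kept = [row for row, _ in pairwise(rows)]
print(kept)  # [['a', 'b'], ['1', '2']]
```

Note that an input with a single row produces no pairs at all, so nothing is kept, which is the desired result here.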
Upvotes: 1
Reputation: 15854
Instead of writing the current line each loop iteration, try writing the previously read line:
import csv
raw_data = csv.reader(open("original_data.csv", "r"), delimiter=",")
output_data = csv.writer(open("final_data.csv", "w"), delimiter=",")
last_iter = (None, None)
try:
    last_iter = (0, next(raw_data))
except StopIteration:
    # The file is empty
    pass
else:
    for new_row in raw_data:
        i, row = last_iter
        last_iter = (i + 1, new_row)
        if i == 10:
            # some operations
            output_data.writerow(row)
        elif i > 10:
            # some operations
            output_data.writerow(row)
    # Here, the last row of the file is in the `last_iter` variable.
    # It won't get written into the output file.
    output_data.writerow([])
Upvotes: 0
Reputation: 14224
You can make a generator to yield all elements except the last one:
def remove_last_element(iterable):
    iterator = iter(iterable)
    try:
        prev = next(iterator)
        while True:
            cur = next(iterator)
            yield prev
            prev = cur
    except StopIteration:
        return
Then you just wrap raw_data in it:
for i, row in enumerate(remove_last_element(raw_data)):
    # your code
The last line will be ignored automatically.
This approach has the benefit of only reading the file once.
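As a rough end-to-end sketch of this approach in Python 3, the generator can be combined with the csv module as shown below. The helper copy_without_last_row and the in-memory io.StringIO buffers are assumptions made for the example; in the question they would be the opened original_data.csv and final_data.csv files.

```python
import csv
import io


def remove_last_element(iterable):
    """Yield every item of `iterable` except the final one."""
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    for cur in iterator:
        yield prev
        prev = cur


def copy_without_last_row(src, dst):
    """Copy CSV rows from `src` to `dst`, dropping the last row (hypothetical helper)."""
    reader = csv.reader(src, delimiter=",")
    writer = csv.writer(dst, delimiter=",", lineterminator="\n")
    for i, row in enumerate(remove_last_element(reader)):
        # Per-row adjustments based on `i` would go here.
        writer.writerow(row)


# In-memory stand-ins for the input and output files.
source = io.StringIO("a,b\n1,2\n3,4\n")
target = io.StringIO()
copy_without_last_row(source, target)
print(target.getvalue())
```

The generator holds back exactly one row at a time, so memory use stays constant regardless of file size.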
Upvotes: 4
Reputation: 801
Another idea is to track the length of each line as you iterate and, once you reach the last line, truncate the file by that amount, thus "shortening the file". I'm not sure whether this is good practice, though.
See, e.g., "Python: truncate a file to 100 lines or less".
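A minimal sketch of the truncation idea in Python 3, assuming the output file has already been written and that every record occupies exactly one physical line (a quoted CSV field containing a newline would break this assumption). The function name truncate_last_line is made up for the example.

```python
import os
import tempfile


def truncate_last_line(path):
    """Remove the final physical line of the file at `path`, in place."""
    last_start = 0  # byte offset where the most recently read line began
    offset = 0      # byte offset of the line about to be read
    with open(path, "rb+") as f:
        for line in f:
            last_start = offset
            offset += len(line)
        # Cut the file off at the start of the last line.
        f.truncate(last_start)


# Demonstration on a temporary file (hypothetical sample data).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"a,b\n1,2\n3,4\n")
truncate_last_line(tmp.name)
with open(tmp.name, "rb") as f:
    result = f.read()
os.remove(tmp.name)
print(result)  # b'a,b\n1,2\n'
```

Working in binary mode keeps the offsets in bytes, which is what truncate expects. Unlike the generator-based answers, this needs the file on disk and a second pass over it.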
Upvotes: 0