Boxuan

Reputation: 5157

Python scan file line by line and remove last line in the same loop

I am trying to scan a CSV file and make adjustments line by line. At the end, I would like to remove the last line. How can I do that within the same scanning loop?

My code below reads from the original file, makes adjustments and finally writes to a new file.

import csv

raw_data = csv.reader(open("original_data.csv", "r"), delimiter=",")
output_data = csv.writer(open("final_data.csv", "w"), delimiter=",")
lastline = # integer index of last line

for i, row in enumerate(raw_data):
    if i == 10:
        # some operations
        output_data.writerow(row)
    elif i > 10 and i < lastline:
        # some operations
        output_data.writerow(row)
    elif i == lastline:
        output_data.writerow([])
    else:
        continue

Upvotes: 1

Views: 799

Answers (5)

georg

Reputation: 214959

A variation of @Kolmar's idea:

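def all_but_last(iterable):
    it = iter(iterable)  # also accept lists, csv readers, etc.
    try:
        buf = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    for item in it:
        yield buf
        buf = item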

for line in all_but_last(...):
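
As a quick check of the edge cases, here is a minimal sketch. The helper is restated so the snippet runs on its own, with a guard for empty input (on Python 3.7+ an uncaught StopIteration inside a generator turns into a RuntimeError). The sample inputs are made up for illustration.

```python
def all_but_last(iterable):
    it = iter(iterable)
    try:
        buf = next(it)
    except StopIteration:
        return              # empty input: nothing to yield
    for item in it:
        yield buf           # emit the previous item once a newer one exists
        buf = item

empty_result = list(all_but_last([]))
single_result = list(all_but_last(["only"]))
several_result = list(all_but_last("abcd"))

print(empty_result)    # []
print(single_result)   # []
print(several_result)  # ['a', 'b', 'c']
```

Only one item is buffered at a time, so this works on a csv.reader without loading the whole file.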

Here's more generic code that extends islice (two-args version) for negative indexes:

import itertools, collections

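def islice2(it, stop):
    it = iter(it)  # both branches must consume the same iterator
    if stop >= 0:
        for x in itertools.islice(it, stop):
            yield x
    else:
        # keep the last -stop items buffered; they are never yielded
        d = collections.deque(itertools.islice(it, -stop))
        for item in it:
            yield d.popleft()
            d.append(item)


for x in islice2(range(20), -5):
    print(x, end=" ")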

# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Upvotes: 2

ovgolovin

Reputation: 13410

You can iterate with a window of size 2 and write only the first value in each window. The last element never appears in the first position, so it is skipped:

from itertools import tee

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)  # lazy in Python 3; use itertools.izip on Python 2

for row, _ in pairwise(raw_data):
    output_data.writerow(row)

output_data.writerow([])
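
To see the window idea in isolation, here is a small sketch on an in-memory list; the sample rows are invented for illustration. On Python 3.10 and later, itertools.pairwise provides the same pairing out of the box.

```python
from itertools import tee

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)        # advance the second copy by one item
    return zip(a, b)     # zip stops when b is exhausted

# Hypothetical sample rows standing in for the csv reader.
rows = [["h1", "h2"], ["1", "2"], ["3", "4"]]

# Keep only the first element of each pair; the final row is never first.
kept = [row for row, _ in pairwise(rows)]
print(kept)  # [['h1', 'h2'], ['1', '2']]
```

Because both copies advance in lockstep, tee only has to buffer a single row at a time.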

Upvotes: 1

Maciej Gol

Reputation: 15854

Instead of writing the current line each loop iteration, try writing the previously read line:

import csv

raw_data = csv.reader(open("original_data.csv", "r"), delimiter=",")
output_data = csv.writer(open("final_data.csv", "w"), delimiter=",")
last_iter = (None, None)

try:
    last_iter = (0, next(raw_data))
except StopIteration:
    # The file is empty
    pass
else:
    for new_row in raw_data:
        i, row = last_iter
        last_iter = (i + 1, new_row)

        if i == 10:
            # some operations
            output_data.writerow(row)
        elif i > 10:
            # some operations
            output_data.writerow(row)

    # Here, the last row of the file is in the `last_iter` variable.
    # It won't get written into the output file.
    output_data.writerow([])

Upvotes: 0

Kolmar

Reputation: 14224

You can make a generator to yield all elements except the last one:

def remove_last_element(iterable):
    iterator = iter(iterable)
    try:
        prev = next(iterator)
        while True:
            cur = next(iterator)
            yield prev
            prev = cur
    except StopIteration:
        return

Then you just wrap raw_data in it:

for i, row in enumerate(remove_last_element(raw_data)):
    # your code

The last line will be ignored automatically.

This approach has the benefit of only reading the file once.
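
Putting it together with the loop from the question, a minimal end-to-end sketch might look like this. It uses in-memory io.StringIO buffers and made-up rows in place of original_data.csv and final_data.csv, and the "some operations" step is left out; those are assumptions for the sake of a runnable example.

```python
import csv
import io

def remove_last_element(iterable):
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return               # empty input
    for cur in iterator:
        yield prev
        prev = cur

# In-memory stand-ins for the input and output files (assumed data: 14 rows).
source = io.StringIO("".join("row{},{}\n".format(i, i) for i in range(14)))
target = io.StringIO()

reader = csv.reader(source)
writer = csv.writer(target, lineterminator="\n")

for i, row in enumerate(remove_last_element(reader)):
    if i >= 10:              # same threshold as in the question
        writer.writerow(row)
writer.writerow([])          # the question writes an empty row at the end

result = target.getvalue()
print(result)
```

Rows 10 to 12 are written; row 13, the last one, never reaches the loop body.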

Upvotes: 4

user1267259

Reputation: 801

One idea is to keep track of the length of each line as you iterate and, once you reach the last line, truncate the file by that amount, thereby shortening it. I am not sure whether this is good practice, though.

For example, see "Python: truncate a file to 100 lines or less".
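
A rough sketch of how the truncation idea could look, assuming it is applied to the output file: remember the offset at which each row starts, write every row, then cut the file back to the start of the final row. The function name write_all_but_last and the sample rows are hypothetical.

```python
import csv
import os
import tempfile

def write_all_but_last(rows, path):
    """Write every row, then cut the final one off again with truncate()."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        last_start = 0
        for row in rows:
            last_start = f.tell()   # offset where this row begins
            writer.writerow(row)
        f.seek(last_start)          # go back to the start of the last row
        f.truncate()                # drop everything from there on

# Demo on a temporary file with made-up rows.
fd, demo_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
write_all_but_last([["a", "1"], ["b", "2"], ["c", "3"]], demo_path)
with open(demo_path, newline="") as f:
    remaining = list(csv.reader(f))
os.remove(demo_path)

print(remaining)  # [['a', '1'], ['b', '2']]
```

Calling tell() before every row forces a flush each time, so this is likely slower than the buffering generators in the other answers.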

Upvotes: 0
