shongyang low

Reputation: 77

Deleting a specific number of lines from text file using Python

Suppose I have a text file that goes like this:

AAAAAAAAAAAAAAAAAAAAA              #<--- line 1
BBBBBBBBBBBBBBBBBBBBB              #<--- line 2
CCCCCCCCCCCCCCCCCCCCC              #<--- line 3
DDDDDDDDDDDDDDDDDDDDD              #<--- line 4
EEEEEEEEEEEEEEEEEEEEE              #<--- line 5
FFFFFFFFFFFFFFFFFFFFF              #<--- line 6
GGGGGGGGGGGGGGGGGGGGG              #<--- line 7
HHHHHHHHHHHHHHHHHHHHH              #<--- line 8


Ignore "#<--- line...", it's just for demonstration


Assumptions


End Result
The end result should look like this:

CCCCCCCCCCCCCCCCCCCCC              #<--- line 3
DDDDDDDDDDDDDDDDDDDDD              #<--- line 4
EEEEEEEEEEEEEEEEEEEEE              #<--- line 5


Lines deleted: First 2 + Everything after the next 3 (i.e. after line 5)

Required
All Pythonic suggestions are welcome! Thanks!




Reference Material
https://thispointer.com/python-how-to-delete-specific-lines-in-a-file-in-a-memory-efficient-way/

import os


def delete_multiple_lines(original_file, line_numbers):
    """Delete the lines whose 0-based indices appear in line_numbers."""
    is_skipped = False
    counter = 0
    # Create name of dummy / temporary file
    dummy_file = original_file + '.bak'
    # Open original file in read only mode and dummy file in write mode
    with open(original_file, 'r') as read_obj, open(dummy_file, 'w') as write_obj:
        # Line by line copy data from original file to dummy file
        for line in read_obj:
            # If current line number exist in list then skip copying that line
            if counter not in line_numbers:
                write_obj.write(line)
            else:
                is_skipped = True
            counter += 1

    # If any line is skipped then rename dummy file as original file
    if is_skipped:
        os.remove(original_file)
        os.rename(dummy_file, original_file)
    else:
        os.remove(dummy_file)


Then...

delete_multiple_lines('sample.txt', [0,1,2])


The problem with this method is that, to delete the first 100 lines of a file, you would have to spell out every index: [0, 1, 2, ..., 99]. Right?
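One way around listing every index is to pass a range object, since the `in` test works on a range just as it does on a list. The sketch below is a hypothetical variant of the reference function, not the linked article's code: the name delete_lines and the temporary-file swap are assumptions made for illustration.

```python
import os
import tempfile


def delete_lines(path, skip):
    """Rewrite `path` without the 0-based line indices contained in `skip`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the kept lines to a temporary file in the same directory.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for index, line in enumerate(src):
            if index not in skip:
                dst.write(line)
    # Swap the temporary file in place of the original.
    os.replace(dst.name, path)
```

With this, delete_lines('sample.txt', range(0, 100)) would drop the first 100 lines without building a 100-element list.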


Answer
Courtesy of @sandes

The following code skips the first 63 lines, keeps the next 95, and writes them to sample2.txt:


with open("sample.txt", "r") as f:
    lines = f.readlines()

skip, keep = 63, 95  # delete the first 63 lines, then keep the next 95
idx_lines_wanted = range(skip, skip + keep)
new_lines = []
for i, line in enumerate(lines):
    if i >= skip + keep:
        break  # everything after the wanted block is dropped
    if i in idx_lines_wanted:
        new_lines.append(line)

with open("sample2.txt", "w") as f:
    for line in new_lines:
        f.write(line)

Upvotes: 1

Views: 1963

Answers (2)

sandes

Reputation: 2267

EDIT: iterating directly over f

Based on @Kenny's comment and @chepner's suggestion:

with open("your_file.txt", "r") as f:
    new_lines = []
    for idx, line in enumerate(f):
        if idx in range(2, 5):  # indices 2, 3, 4, i.e. lines 3 to 5
            new_lines.append(line)

with open("your_new_file.txt", "w") as f:
    for line in new_lines:
        f.write(line)
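The same idea can also be written without holding the kept lines in a list. The following is a minimal sketch using itertools.islice from the standard library. The function name keep_line_range and its parameters are hypothetical, chosen only for this example.

```python
from itertools import islice


def keep_line_range(src_path, dst_path, skip, keep):
    """Copy `keep` lines from src_path to dst_path after skipping `skip` lines."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        # islice yields lines skip .. skip + keep - 1 and then stops reading.
        dst.writelines(islice(src, skip, skip + keep))
```

For the sample file in the question, keep_line_range("your_file.txt", "your_new_file.txt", 2, 3) would keep lines 3 to 5.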

Upvotes: 3

chepner

Reputation: 531055

This is really something that's better handled by an actual text editor.

import subprocess

subprocess.run(['ed', original_file], input=b'1,2d\n+3,$d\nwq\n')

A crash course in ed, the POSIX standard text editor.

ed opens the file named by its argument. It then proceeds to read commands from its standard input. Each command is a single character, with some commands taking one or two "addresses" to indicate which lines to operate on.

After each command, the "current" line number is updated to reflect the line last affected by that command (for d, that is the line that followed the deleted range). This is used with relative addresses, as we'll see in a moment.

  • 1,2d means to delete lines 1 through 2; the current line becomes the new line 1 (originally line 3)
  • +3,$d deletes all the lines from the new line 4 (current line is 1, so 1 + 3 == 4; this was originally line 6) through the end of the file ($ is a special address indicating the last line of the file)
  • wq writes all changes to disk and quits the editor.
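To check the address arithmetic without running ed, the two delete commands can be simulated on a Python list. This is only a sketch: ed_delete is a hypothetical helper, and it assumes the POSIX rule that after d the current line is the line that followed the deleted range, or the new last line if the deletion reached the end of the buffer.

```python
def ed_delete(lines, first, last):
    """Delete 1-based lines first..last; return (remaining lines, new current line)."""
    remaining = lines[: first - 1] + lines[last:]
    # Assumed rule: the current line is the line after the deleted range,
    # or the new last line when nothing follows it.
    new_current = min(first, len(remaining))
    return remaining, new_current


buffer = list("ABCDEFGH")  # one letter stands for each line of the sample file

buffer, current = ed_delete(buffer, 1, 2)                      # 1,2d
buffer, current = ed_delete(buffer, current + 3, len(buffer))  # +3,$d
print(buffer)
```

The first call leaves the lines starting with C as line 1, so the relative address +3 in the second call points at the line that was originally line 6.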

Upvotes: 2
