mmanol

Reputation: 13

Delete line that contains a string in a txt file python

I'm trying to delete a line in a txt file which contains a variable (email).

I want to delete the whole line that contains the email (e.g. [email protected]), not just the email itself. This is what I've come up with so far, but it doesn't seem to work.

with open("wappoint.txt.txt", "r") as w:
    lines = w.readlines()
with open("wappoint.txt.txt", "w") as w:
    for line in lines:
        if email.strip("\n") != email:
            w.write(line)

The contents of the txt file are

[email protected], 1
[email protected], 3

Upvotes: 1

Views: 1782

Answers (3)

Pierre D

Reputation: 26201

There are a number of considerations to address about this:

  1. If your file is large, it isn't a good idea to load it all in memory.
  2. If some exception occurs during processing (maybe even a KeyboardInterrupt), it is often desirable to leave your original file untouched (so, we'll try to make your operation ACID).
  3. If multiple concurrent processes try to modify your file, you would like some guarantee that, at least, yours is safe (also ACID).
  4. You may (or may not) want a backup for your file.

There are a number of possibilities (see e.g. this SO question). In my experience however, I got mixed results with fileinput: it makes it easy to modify one or several files in place, optionally creating a backup for each, but unfortunately it writes eagerly in each file (possibly leaving it incomplete when encountering an exception). I put an example at the end for reference.

What I've found to be the simplest and safest approach is to use a temporary file (in the same directory as the file you are processing and named uniquely but in a recognizable manner), do your operation from src to tmp, then mv tmp src which, at least for practical purposes, is atomic on most POSIX filesystems.

import os
import tempfile

def acceptall(line):
    return True

def filefilter(filename, filterfunc=acceptall, backup=None):
    if backup:
        backup = f'{filename}{backup}'  # leave None if no backup wanted
    # put the temp file next to the source so the final rename stays on one filesystem
    dirname, basename = os.path.split(filename)
    tmpname = tempfile.mktemp(prefix=f'.{basename}-', dir=dirname or '.')
    with open(tmpname, 'w') as tmp, open(filename, 'r') as src:
        for line in src:
            if filterfunc(line):
                tmp.write(line)
    if backup:
        os.rename(filename, backup)
    os.rename(tmpname, filename)

Example for your case:

filefilter('wappoint.txt.txt', lambda line: email not in line)

Using a regex to exclude multiple email addresses (case-insensitive and only fully matching), and generating a .bak backup file:

import re

matcher = re.compile(r'.*\b(bob|fred|jeff)@foo\.com\b', re.IGNORECASE)
filefilter(filename, lambda line: not matcher.match(line), backup='.bak')
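
To see which lines such a pattern would drop, here is a small sketch that runs the same regex over a few made-up lines. The sample addresses are hypothetical and only illustrate the case-insensitivity and the effect of the `\b` word boundaries.

```python
import re

matcher = re.compile(r'.*\b(bob|fred|jeff)@foo\.com\b', re.IGNORECASE)

# Hypothetical sample lines, not taken from the question's file.
lines = [
    'Bob@foo.com, 1\n',        # matches: case is ignored
    'bobby@foo.com, 2\n',      # no match: "bob" is not directly followed by "@"
    'bob@foo.community, 3\n',  # no match: no word boundary right after "com"
    'alice@foo.com, 4\n',      # no match: not one of the listed names
]

# Keep only the lines the filter would write back to the file.
kept = [line for line in lines if not matcher.match(line)]
print(kept)
```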

We can also simulate what happens if an exception is raised in the middle (e.g. on the first matching line):

def flaky(line):
    if email in line:
        1 / 0
    return True

filefilter(filename, flaky)

That will raise ZeroDivisionError upon the first matching line. But notice how your file is not modified at all in that case (and no backup is made). As a side-effect, the temporary file remains (this is consistent with other utils, e.g. rsync, that leave .filename-<random> incomplete temp files at the destination when interrupted).
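
If the leftover temporary file is unwanted, one possible variation deletes it when the filter fails. This is only a sketch under some assumptions: the name `filefilter_clean` is hypothetical, it swaps in `tempfile.mkstemp` (which creates the file rather than only picking a name) and `os.replace`, it makes no backup, and it does not copy the original file's permissions.

```python
import os
import tempfile

def filefilter_clean(filename, filterfunc):
    """Filter filename line by line; leave no temp file behind on failure."""
    dirname = os.path.dirname(os.path.abspath(filename))
    # mkstemp creates the temp file in the same directory and returns an open descriptor
    fd, tmpname = tempfile.mkstemp(
        prefix=f'.{os.path.basename(filename)}-', dir=dirname, text=True)
    try:
        with os.fdopen(fd, 'w') as tmp, open(filename, 'r') as src:
            for line in src:
                if filterfunc(line):
                    tmp.write(line)
        os.replace(tmpname, filename)  # swap in the filtered copy
    except BaseException:
        os.unlink(tmpname)  # discard the partial temp file, then re-raise
        raise
```

With this variant, a filter that raises still leaves the original file untouched, and the directory contains no stray `.filename-<random>` file afterwards.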


As promised, here is also an example using fileinput, but with the caveats explained earlier:

import fileinput

with fileinput.input(filename, inplace=True, backup='.bak') as f:
    for line in f:
        if email not in line:
            print(line, end='')  # this prints back to filename

Upvotes: 1

ppwater

Reputation: 2277

Are you looking for this?

with open("wappoint.txt", "r") as w:
    lines = w.readlines()
with open("wappoint.txt", "w") as w:
    for line in lines:
        if email not in line:
            w.write(line)

This removes every line that contains the email.
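
One caveat with a substring test is that it also drops lines whose address merely contains the email as part of a longer address. Assuming each line has the form `address, number`, as in the question's sample, a sketch like the following compares the first field exactly instead. The names `keep_line` and `remove_email` and the sample addresses are hypothetical.

```python
def keep_line(line, email):
    """Return True unless the first comma-separated field equals email."""
    first_field = line.split(",", 1)[0].strip()
    return first_field != email

def remove_email(path, email):
    """Rewrite the file at path without the lines whose first field is email."""
    with open(path, "r") as f:
        lines = f.readlines()
    with open(path, "w") as f:
        f.writelines(line for line in lines if keep_line(line, email))

# Hypothetical addresses: the second contains the first as a substring.
print(keep_line("alice@example.com, 1\n", "alice@example.com"))
print(keep_line("malice@example.com, 3\n", "alice@example.com"))
```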

Upvotes: 2

costaparas

Reputation: 5237

It seems like you just want to check if email occurs in the line.

Your code does an inequality comparison, when it should instead check for a substring (i.e. whether email occurs in line).

A suitable condition is:

if email not in line:
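
A quick way to see the difference between the two conditions is to evaluate both on one line. This is a small illustration; the address and the line content are hypothetical stand-ins for the question's data.

```python
email = "alice@example.com"      # hypothetical address
line = "alice@example.com, 1\n"  # hypothetical line from the file

# The question's condition compares email with itself (strip changes nothing
# here), so it is always False and no line is ever written back.
always_false = email.strip("\n") != email

# A substring test asks the intended question: does this line mention email?
contains_email = email in line

print(always_false, contains_email)
```

Because the original condition never holds, the loop writes nothing, which is why the file ends up empty rather than merely missing one line.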

Upvotes: 1
