Arif

Reputation: 43

Python - remove a word or matching string from a text file

I am trying to remove a word from a text file and found some code that seems to work.

However, it doesn't match the exact word; instead, it removes every matching letter.

fin = open("input.txt")
fout = open("output.txt", "w+")
delete_list = input('delete : ')
for line in fin:
    for word in delete_list:
        line = line.replace(word, '')
    fout.write(line)
fin.close()
fout.close()
print ('done')

input.txt

http://www.google.co.ma
google.com.mm
https://google.mn
www.google.com.mt

The result of trying to remove only http:// is as follows:

output.txt

www.google.co.ma
google.com.mm
sgoogle.mn
www.google.com.m

Upvotes: 2

Views: 5411

Answers (2)

Empiu

Reputation: 11

I just started coding, so I don't know how ugly this solution looks, but the re module seems to work fine.

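from re import sub
# splitlines() avoids the empty last entry that split('\n') leaves
# when the file ends with a newline
with open('test.txt') as f:
    lines = f.read().splitlines()
# Remove every http:// or https:// occurrence from each line
lines = [sub(r'https?://', '', line) for line in lines]
with open('test1.txt', 'w') as f1:
    f1.writelines("%s\n" % line for line in lines)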

Or, if you don't want to use the re module, you can use an if statement instead:

with open('test.txt') as f:
    lines = f.read().splitlines()
for i, line in enumerate(lines):
    # Slice off whichever scheme prefix the line starts with
    if line.startswith('https://'):
        lines[i] = line[len('https://'):]
    elif line.startswith('http://'):
        lines[i] = line[len('http://'):]
with open('test1.txt', 'w') as f1:
    f1.writelines("%s\n" % line for line in lines)

Upvotes: 1

Alex Lew

Reputation: 2124

Let's take a look at what's happening here:

  1. You call input, which returns a string, "http://". You assign this to the variable delete_list.
  2. You loop through delete_list using a for loop. But note: delete_list is a string, not a list. When you use a for loop to iterate through a string, it loops through the letters of the string.
  3. You go through each letter and remove it from the line.
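
The three steps above can be reproduced in a small sketch. The helper name strip_each is hypothetical and only mirrors the inner loop from the question; the sample lines come from input.txt.

```python
def strip_each(line, delete_list):
    # Mirrors the loop in the question: replace() runs once per item
    # that iteration over delete_list yields.
    for word in delete_list:
        line = line.replace(word, '')
    return line

# Iterating over a string yields its characters one at a time.
chars = list("http://")

# Passing the string itself removes every 'h', 't', 'p', ':' and '/'.
as_string = strip_each("https://google.mn", "http://")

# Passing a one-element list removes only the exact substring.
as_list = strip_each("http://www.google.co.ma", ["http://"])

print(chars)      # ['h', 't', 't', 'p', ':', '/', '/']
print(as_string)  # sgoogle.mn
print(as_list)    # www.google.co.ma
```

The second result matches the unexpected "sgoogle.mn" line in the question's output.txt.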

There are three things you could do to fix this:

  1. Change your assignment of delete_list to assign to a single-element list: delete_list = [input("word to delete: ")]

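  2. Rename delete_list to more accurately reflect its true value, something like word_to_delete, then drop the inner for loop and do line = line.replace(word_to_delete, '') directly. The result has to be assigned back, because replace returns a new string and leaves the original unchanged.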

  3. Use a loop to get a list of words from the user.
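
A minimal sketch of the first and third options, under some assumptions: the function name remove_words is hypothetical, the sample list stands in for the lines of input.txt, and a space-separated string stands in for what input() would return, so the example runs without files or a prompt.

```python
def remove_words(lines, words_to_delete):
    """Return the lines with every exact occurrence of each word removed."""
    cleaned = []
    for line in lines:
        for word in words_to_delete:
            line = line.replace(word, '')
        cleaned.append(line)
    return cleaned

# Stand-in for the lines read from input.txt
sample = [
    "http://www.google.co.ma\n",
    "google.com.mm\n",
    "https://google.mn\n",
    "www.google.com.mt\n",
]

# Option 1: wrap the single word in a list so the loop sees it whole.
result = remove_words(sample, ["http://"])

# Option 3: collect several words; split() here is an assumed stand-in
# for gathering them from the user in a loop.
several = "http:// https://".split()
result_several = remove_words(sample, several)

print(result)
print(result_several)
```

In the original script, the open file object could be passed in place of sample and [input('delete : ')] in place of the hard-coded list.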

Hope that clears things up!

Upvotes: 1
