pallavi gupta

Reputation: 191

How to remove words from a list in Python

I have written my code this far, but it is not working properly with remove(). Can anyone help me?

'''
Created on Apr 21, 2015

@author: Pallavi
'''
from pip._vendor.distlib.compat import raw_input
print ("Enter Query")
str=raw_input()  

fo = open("stopwords.txt", "r+")
str1 = fo.read();
list=str1.split("\n");
fo.close()
words=str.split(" ");
for i in range(0,len(words)):
    for j in range(0,len(list)):
        if(list[j]==words[i]):
            print(words[i])
            words.remove(words(i))

Here is the error:

Enter Query
let them cry try diesd
let them try
Traceback (most recent call last):
  File "C:\Users\Pallavi\workspace\py\src\parser.py", line 17, in <module>
    if(list[j]==words[i]):
IndexError: list index out of range

Upvotes: 14

Views: 92486

Answers (4)

Neha

Reputation: 11

One more easy way to remove words from a list is to convert both lists to sets and take the difference between them. Note that this drops duplicate words and does not preserve the original order.

words = ['a', 'b', 'a', 'c', 'd']
words = set(words)
stopwords = ['a', 'c']
stopwords = set(stopwords)
final_list = words - stopwords
final_list = list(final_list)
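
A small sketch of that trade-off, reusing the words and stopwords names from the snippet above with slightly extended sample data. The names unique_remaining and ordered_remaining are made up for this illustration. If order or repeated words matter, the set can still be used for fast lookups while the list itself is filtered:

```python
words = ['a', 'b', 'a', 'c', 'd', 'b']
stopwords = {'a', 'c'}

# Set difference: duplicates collapse and the order is arbitrary,
# so sort the result if a stable order is needed.
unique_remaining = sorted(set(words) - stopwords)

# Keeping order and duplicates: filter the list and use the set only for lookups.
ordered_remaining = [w for w in words if w not in stopwords]

print(unique_remaining)   # ['b', 'd']
print(ordered_remaining)  # ['b', 'd', 'b']
```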

Upvotes: 1

Sorin Dragan

Reputation: 540

As an observation, this could be another elegant way to do it:

new_words = list(filter(lambda w: w not in stop_words, initial_words))
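
For reference, a self-contained version of that one-liner might look like the following. The sample values for initial_words and stop_words are assumptions, taken from the query and the matched words shown in the question's output. In Python 3, filter() returns a lazy iterator, which is why the result is wrapped in list().

```python
initial_words = "let them cry try diesd".split()
stop_words = {"let", "them", "try"}  # sample stop words, assumed for this example

# filter() keeps only the items for which the lambda returns True.
new_words = list(filter(lambda w: w not in stop_words, initial_words))

print(new_words)  # ['cry', 'diesd']
```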

Upvotes: 9

dotbit

Reputation: 5055

''' Call this script from a Bash shell (e.g. in Konsole) like so:    python reject.py
    Purpose of this script: remove certain words from a list of words,
    e.g. remove invalid packages from a request list using
    a list of rejected packages from the logfile,
    say on https://fai-project.org/FAIme/#
    Remove trailing spaces from the word list first, e.g. with KDE Kate like so:

kate: remove-trailing-space on; BOM off;
'''
# Read the words to reject, one per line.
with open("rejects", "r") as rejects_file:
    toreject = rejects_file.read().split("\n")

# Read the full word list, one per line.
with open("wordlist", "r") as wordlist_file:
    words = wordlist_file.read().split("\n")

# Keep only the words that are not in the reject list.
new_words = [word for word in words if word not in toreject]

# Write the remaining words to "cleaned", one per line.
with open("cleaned", "w") as cleaned_file:
    for word in new_words:
        cleaned_file.write("%s\n" % word)

Upvotes: 3

Francis Colas

Reputation: 3647

The error you get (besides the issues in my other comments) comes from modifying a list while iterating over it. The range is built from the length of the list at the start of the loop, so after you've removed some elements the list is shorter and the last indices no longer exist, which raises the IndexError.
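
A minimal sketch of that failure, using made-up sample lists rather than the original stopwords.txt. The loop is a simplified version of the one in the question, with the stop-word check written as a membership test:

```python
words = ['let', 'them', 'cry', 'try', 'diesd']
stopwords = ['let', 'them', 'try']

error = None
try:
    # range(len(words)) is evaluated once, with the original length of 5.
    for i in range(len(words)):
        if words[i] in stopwords:
            words.remove(words[i])  # the list shrinks, but the range does not
except IndexError as exc:
    error = exc

print(words)        # ['them', 'cry', 'diesd']
print(repr(error))  # IndexError('list index out of range')
```

Note that 'them' also survives: removing an element shifts the rest of the list left, so the word that slides into the current index is never checked.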

I would do it this way:

words = ['a', 'b', 'a', 'c', 'd']
stopwords = ['a', 'c']
for word in list(words):  # iterating on a copy since removing will mess things up
    if word in stopwords:
        words.remove(word)

An even more Pythonic way is to use a list comprehension:

new_words = [word for word in words if word not in stopwords]
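
Applied to the original question, the list comprehension could be wrapped up roughly as follows. This is only a sketch: load_stopwords and remove_stopwords are hypothetical helper names, and it assumes stopwords.txt holds one word per line, as the question's split on newlines suggests. Keeping the stop words in a set makes each membership test constant time on average.

```python
def load_stopwords(path):
    """Read one stop word per line (assumed file layout) into a set."""
    with open(path, "r") as handle:
        return {line.strip() for line in handle if line.strip()}


def remove_stopwords(query, stopwords):
    """Return the words of `query` that are not in `stopwords`, in order."""
    return [word for word in query.split() if word not in stopwords]


if __name__ == "__main__":
    # In the question this would be load_stopwords("stopwords.txt").
    sample_stopwords = {"let", "them", "try"}
    print(remove_stopwords("let them cry try diesd", sample_stopwords))
```

The sketch also avoids the variable names str and list. The question's code rebinds both, and with list rebound to the stop-word list, a call such as list(words) would fail with a TypeError.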

Upvotes: 41
