Reputation: 7
I do not understand why this code does not work. When I run it, it prints "After stopwords removal: None". Can anyone help me fix the problem? Many thanks.
stop_words = ["the", "of", "a", "to", "be", "from", "or"]
last = lower_words.split()
for i in stop_words:
    lastone = last.remove(i)
print "\nAfter stopwords removal:\n",lastone
Upvotes: 0
Views: 2469
Reputation: 21
Here is a function that takes a text and returns it without the stopwords. It works by skipping every word that appears in stopwords. I call .lower() on each word i because most stopword lists are lowercase, while the text may not be.
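def cut_stop_words(text, stopwords):
    new_text = ''
    for i in text.split():
        # compare in lowercase so "The" matches the stopword "the"
        if i.lower() in stopwords:
            continue
        new_text = new_text.strip() + ' ' + i
    # strip the leading space added before the first kept word
    return new_text.strip()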
Upvotes: 0
Reputation: 208405
The list.remove() function modifies the list in place and returns None. So when you do last.remove(i), it will remove the first occurrence of i from the list last and return None, so lastone will always be set to None.
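A minimal sketch of that behavior, using a made-up word list (the names words and result are only for illustration and are not from the question):

```python
words = ["the", "cat", "sat", "on", "the", "mat"]

# remove() mutates the list in place; its return value is always None
result = words.remove("the")

print(result)  # None
print(words)   # only the first "the" is gone; the second one is still there
```

The useful result is the mutated list itself, so assigning the return value of remove() to a variable only ever captures None.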
For what you are trying to do, you probably want all occurrences of an item from stop_words removed, so last.remove() will not be the most efficient method. Instead, I would do something like the following with a list comprehension:
stop_words = set(["the", "of", "a", "to", "be", "from", "or"])
last = lower_words.split()
last = [word for word in last if word not in stop_words]
Converting stop_words to a set makes the membership test more efficient, but you would get the same behavior if you left it as a list.
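For a version you can run end to end, here is a rough sketch that wraps the comprehension in a function. The function name remove_stop_words and the sample sentence are assumptions for illustration, and the .lower() call is an extra that only matters if the input is not already lowercased:

```python
def remove_stop_words(text, stop_words):
    """Return the words of text that are not stop words (case-insensitive)."""
    stop_words = set(stop_words)  # average O(1) membership tests
    return [word for word in text.split() if word.lower() not in stop_words]


# hypothetical input standing in for lower_words from the question
lower_words = "the history of a journey from the north to the south"
last = remove_stop_words(lower_words, ["the", "of", "a", "to", "be", "from", "or"])

print(last)
```

The comprehension builds a new list rather than mutating the old one, which is why its result can be assigned back to last.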
And for completeness, here is how you would need to do this with remove():
stop_words = ["the", "of", "a", "to", "be", "from", "or"]
last = lower_words.split()
for word in stop_words:
    try:
        # keep removing until remove() raises ValueError (no occurrences left)
        while True:
            last.remove(word)
    except ValueError:
        pass
Upvotes: 2