Behzat

Reputation: 7

Stopwords Removal with Python

I do not understand why this code does not work. When I run it, it prints "After stopwords removal: None". Can anyone help me fix the problem? Many thanks.

 stop_words = ["the", "of", "a", "to", "be", "from", "or"]
 last = lower_words.split()

 for i in stop_words:
     lastone = last.remove(i)
     print "\nAfter stopwords removal:\n",lastone

Upvotes: 0

Views: 2469

Answers (2)

panomi

Reputation: 21

Here is a function that takes a text and returns it with the stopwords removed. It does this by skipping every word that appears in the stopwords collection. I call .lower() on each word i because most stopword lists are in lowercase, but the text may not be.

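def cut_stop_words(text, stopwords):
    new_text = ''
    for i in text.split():
        # Compare in lowercase so capitalized stopwords are skipped too
        if i.lower() in stopwords:
            continue
        new_text = new_text + ' ' + i
    # Drop the leading space added before the first kept word
    return new_text.strip()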

Upvotes: 0

Andrew Clark

Reputation: 208405

The list.remove() method modifies the list in place and returns None.

So when you do last.remove(i), it will remove the first occurrence of i from the list last and return None, so lastone will always be set to None.
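A quick way to see this, as a minimal sketch with a made-up word list (the names words and result are only for illustration):

```python
# Made-up sample data, not taken from the question
words = ["the", "cat", "and", "the", "hat"]

# remove() deletes only the first "the", changes the list in place,
# and hands back None rather than the modified list
result = words.remove("the")

print(result)  # None
print(words)   # ['cat', 'and', 'the', 'hat']
```

Note that the second "the" is still in the list, which is why a single remove() call per stop word is not enough.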

For what you are trying to do, you probably want all occurrences of each item in stop_words removed, so last.remove() will not be the most efficient method. Instead, I would use a list comprehension, something like the following:

stop_words = set(["the", "of", "a", "to", "be", "from", "or"])
last = lower_words.split()
last = [word for word in last if word not in stop_words]

Converting stop_words to a set makes the membership test more efficient, but you would get the same behavior if you left it as a list.
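To illustrate, here is a self-contained sketch. The sentence assigned to lower_words is made up, since the question does not show how it is defined, and remove_stop_words is a hypothetical helper name used only for this example.

```python
def remove_stop_words(text, stop_words):
    """Return the words of text that are not in stop_words, in order."""
    return [word for word in text.split() if word not in stop_words]


# Made-up sample input; the question does not show the real lower_words
lower_words = "the cat sat on the mat to be near a bowl of milk"
stop_list = ["the", "of", "a", "to", "be", "from", "or"]

# Same filtering with a list and with a set of stop words
from_list = remove_stop_words(lower_words, stop_list)
from_set = remove_stop_words(lower_words, set(stop_list))

print(from_set)               # ['cat', 'sat', 'on', 'mat', 'near', 'bowl', 'milk']
print(from_list == from_set)  # True
```

The result is the same either way; the set only changes how quickly each "word not in stop_words" check runs, which matters more as the stop word list grows.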

And for completeness, here is how you would need to do this with remove():

stop_words = ["the", "of", "a", "to", "be", "from", "or"]
last = lower_words.split()
for word in stop_words:
    try:
        while True:
            last.remove(word)
    except ValueError:
        pass

Upvotes: 2
