Reputation: 231
I am trying to reuse this code from another source, but I am having trouble understanding the for loop in the second line. Can someone please clarify what exactly the line title = [x for x in title if x not in stopWords] is doing? stopWords is a list of words.
def title_score(title, sentence):
    title = [x for x in title if x not in stopWords]
    count = 0.0
    for word in sentence:
        if (word not in stopWords and word in title):
            count += 1.0
    if len(title) == 0:
        return 0.0
    return count/len(title)
Upvotes: 1
Views: 72
Reputation: 881153
[x for x in title if x not in stopWords]
It's a list comprehension. It means: construct a list of all items in title (that's the x for x in title bit) that are not also in stopWords (per the if x not in stopWords bit).
You can see a similar effect with the following snippets. The first creates a list of all numbers in the inclusive range 0..9:
>>> [x for x in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The second adds an if clause to only include odd numbers:
>>> [x for x in range(10) if x % 2 != 0]
[1, 3, 5, 7, 9]
And here's perhaps a better example, more closely aligned to your code:
>>> stopWords = "and all but if of the".split() ; stopWords
['and', 'all', 'but', 'if', 'of', 'the']
>>> title = "the sum of all fears".split() ; title
['the', 'sum', 'of', 'all', 'fears']
>>> [x for x in title]
['the', 'sum', 'of', 'all', 'fears']
>>> [x for x in title if x not in stopWords]
['sum', 'fears']
There you can see the "noise" words being removed in the final step.
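To tie this back to the function in the question, here is a minimal sketch of title_score run on the same sample data. The stop_words parameter is an assumption added so the block is self-contained (the original reads a global stopWords), and the sample sentence is made up for illustration.

```python
def title_score(title, sentence, stop_words):
    # Keep only the title words that are not stop words.
    title = [x for x in title if x not in stop_words]
    if not title:
        return 0.0
    # Count sentence words that are not stop words and also appear in the filtered title.
    count = sum(1.0 for word in sentence
                if word not in stop_words and word in title)
    return count / len(title)


stop_words = "and all but if of the".split()
title = "the sum of all fears".split()
sentence = "the sum of the parts".split()  # hypothetical sentence

score = title_score(title, sentence, stop_words)
print(score)  # 0.5: one of the two remaining title words ("sum") appears in the sentence
```

The filtered title is ['sum', 'fears'], and only 'sum' occurs in the sentence, so the score is 1 out of 2.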
Upvotes: 2
Reputation: 35059
That is a list comprehension, equivalent to this loop:
newtitle = []
for x in title:
    if x not in stopWords:
        newtitle.append(x)
title = newtitle
In other words, it effectively removes any words from title if they also appear in stopWords.
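As a quick check that the explicit loop and the comprehension really produce the same list, here is a small sketch. The sample stop_words and title values are made up for illustration and are not from the question.

```python
# Hypothetical sample data.
stop_words = ["and", "all", "but", "if", "of", "the"]
title = ["the", "sum", "of", "all", "fears"]

# Explicit loop version.
loop_result = []
for x in title:
    if x not in stop_words:
        loop_result.append(x)

# List comprehension version.
comp_result = [x for x in title if x not in stop_words]

print(loop_result)                 # ['sum', 'fears']
print(loop_result == comp_result)  # True
```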
Upvotes: 0
Reputation: 1368
Well, they say that Python is like runnable pseudocode, and I guess that applies here. It creates a list and puts into it every item in title that is not in stopWords.
Upvotes: 0