nish

Reputation: 325

remove() function for list not working

I have written a Python script to calculate the semantic similarity between words in a set. Based on that, I want to remove words that are not strongly correlated with the others. The following code is meant to remove such words from the set.

line_combined=copy(line1)
threshold = 1/len(line_combined)
for word3 in line_combined:
    print("simdict[" + word3 + "] =" + str(simdict[word3]))
    print ("ratio is: " + str(simdict[word3]/linesumsim))
    if(simdict[word3]/linesumsim)<threshold:
        line_combined.remove(word3)
        print word3 + " is removed"
print "the output is:"
print line_combined

"line1" is the set of words under consideration, stored as a list. "simdict[word]" holds the sum of the similarities of "word" with the rest of the words in the set. "linesumsim" is the sum of the "simdict" values for all the words in the set.

Output is:

linesumsim is 2.82012427883
simdict[city] =0.517357507497
ratio is: 0.183452024217
simdict[mountain] =0.642265108364
ratio is: 0.227743547752
simdict[sky] =0.484908130427
ratio is: 0.171945660007
simdict[sun] =0.637289239227
ratio is: 0.225979132909
simdict[characteristics] =0.538304293319
ratio is: 0.190879635114
the output is:
['city', 'mountain', 'sky', 'sun', 'characteristics']

Clearly there are words whose ratio is less than the threshold, 0.2 in this case, but they are not being removed.

Upvotes: 2

Views: 145

Answers (1)

Henrik Andersson

Reputation: 47172

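You can't safely remove items from a list while iterating over that same list: after each removal the remaining items shift left, so the loop skips the item that follows the one you removed.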
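A small illustration of that skipping, using a made-up list of numbers rather than the words from the question:

```python
# Illustrative data only; not taken from the question.
numbers = [1, 2, 3, 4]

for n in numbers:
    # Removing the current item shifts everything after it one place left,
    # so the loop's internal index lands past the next item.
    numbers.remove(n)

# 2 and 4 were never visited, so they are still in the list.
print(numbers)
```

The loop body runs only twice (for 1 and 3), even though the list started with four items.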

Change

for word3 in line_combined:

to

for word3 in line1:
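A minimal sketch of the loop with that change, using the numbers from the question's output. The function name `prune_weak_words` is hypothetical, and a plain `list(...)` copy stands in for `copy(line1)`. Separately, if the script runs under Python 2 (the print statements suggest it does), `1/len(line_combined)` is integer division and evaluates to 0 for a five-word list, so no ratio can ever fall below it; the sketch writes `1.0 / len(...)` to avoid that.

```python
# Values copied from the output shown in the question.
simdict = {
    "city": 0.517357507497,
    "mountain": 0.642265108364,
    "sky": 0.484908130427,
    "sun": 0.637289239227,
    "characteristics": 0.538304293319,
}
line1 = ["city", "mountain", "sky", "sun", "characteristics"]
linesumsim = sum(simdict[word] for word in line1)


def prune_weak_words(words, simdict, linesumsim):
    """Return a copy of words without those whose share of the total is below 1/len(words)."""
    # 1.0 forces float division on Python 2 as well as Python 3.
    threshold = 1.0 / len(words)
    kept = list(words)
    # Iterate over the original list and remove from the copy.
    for word in words:
        if simdict[word] / linesumsim < threshold:
            kept.remove(word)
    return kept


line_combined = prune_weak_words(line1, simdict, linesumsim)
print(line_combined)
```

With these values, "city", "sky" and "characteristics" have ratios below 0.2 and are dropped, leaving "mountain" and "sun".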

Upvotes: 1
