Reputation: 133
I have encountered a ZeroDivisionError but I can't find its cause. My intention is to compare the words in a text file which contains these words:
secondly
pardon
woods
secondly
I wrote the script to compare the two values this way:
secondly, pardon
secondly, woods
secondly, secondly
pardon, woods
pardon, secondly
woods, secondly
My code does the following:
1) If the words are the same, it gives a score of 1; otherwise the score is calculated by the gensim vector model.
2) There is a counter, and the counter resets when the first for loop moves to the next word. E.g. secondly, pardon > secondly, woods > secondly, secondly (at this point the count is 3).
The code
from __future__ import division
import gensim

textfile = 'businessCleanTxtUniqueWords'
model = gensim.models.Word2Vec.load("businessSG")
count = 0 # keep track of counter
score = 0
avgScore = 0
SentenceScore = 0
externalCount = 0
totalAverageScore = 0

with open(textfile, 'r+') as f1:
    words_list = f1.readlines()

for each_word in words_list:
    word = each_word.strip()
    for each_word2 in words_list[words_list.index(each_word) + 1:]:
        count = count + 1
        try:
            word2 = each_word2.strip()
            print(word, word2)
            # if words are the same
            if (word == word2):
                score = 1
            else:
                score = model.similarity(word,word2) # when words are not the same
        # if word is not in vector model
        except KeyError:
            score = 0
        # to keep track of the score
        SentenceScore=SentenceScore + score
        print("the score is: " + str(score))
        print("the count is: " + str(count))
    # average score
    avgScore = round(SentenceScore / count,5)
    print("the avg score: " + str(SentenceScore) + '/' + str(count) + '=' + str(avgScore))
    # reset counter and sentence score
    count = 0
    SentenceScore = 0
The error message:
Traceback (most recent call last):
  File "C:/Users/User/Desktop/Complete2/Complete/TrainedTedModel/LatestJR.py", line 41, in <module>
    avgScore = round(SentenceScore / count,5)
ZeroDivisionError: division by zero
('secondly', 'pardon')
the score is: 0.180233083443
the count is: 1
('secondly', 'woods')
the score is: 0.181432347816
the count is: 2
('secondly', 'secondly')
the score is: 1
the count is: 3
the avg score: 1.36166543126/3=0.45389
('pardon', 'woods')
the score is: 0.405021005657
the count is: 1
('pardon', 'secondly')
the score is: 0.180233083443
the count is: 2
the avg score: 0.5852540891/2=0.29263
('woods', 'secondly')
the score is: 0.181432347816
the count is: 1
the avg score: 0.181432347816/1=0.18143
I have included "from __future__ import division" for the division, but it does not seem to fix it.
My files can be found in the following link:
Gensim model:
Textfile:
Thank you.
Upvotes: 0
Views: 832
Reputation: 21453
The line that goes wrong is stated directly in the error message:
Traceback (most recent call last):
  File "C:/Users/User/Desktop/Complete2/Complete/TrainedTedModel/LatestJR.py", line 41, in <module>
    avgScore = round(SentenceScore / count,5)
ZeroDivisionError: division by zero
SentenceScore / count is the division in question, so count must be 0. I would suggest adding something like this right before that line:
print("SentenceScore is", SentenceScore, "and count is", count)
so you can see this for yourself. The inner loop:
for each_word2 in words_list[words_list.index(each_word) + 1:]:
    count = count + 1
is the only thing that adds to count, and count is reset to zero at the end of each iteration of the outer loop. So at some point the inner loop must not be running at all, which means that words_list[words_list.index(each_word) + 1:] is an empty sequence. This happens when each_word is the last word in words_list.
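To make the empty slice concrete, here is a small sketch. The three-word list and the helper name remaining_after are only illustrative assumptions; they stand in for the lines read from your file and for the slice expression in your inner loop.

```python
# Hypothetical stand-in for the words read from the text file.
words_list = ["secondly", "pardon", "woods"]

def remaining_after(words, each_word):
    """Return the slice that the inner loop would iterate over for each_word."""
    return words[words.index(each_word) + 1:]

first_slice = remaining_after(words_list, words_list[0])
last_slice = remaining_after(words_list, words_list[-1])

print(first_slice)  # the words after "secondly"
print(last_slice)   # nothing comes after the last word, so this is empty
```

With an empty slice the inner loop body never runs, count stays at 0, and the division fails. Note that from __future__ import division only changes how / treats integer operands; it has no effect on dividing by zero.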
Upvotes: 1
Reputation: 46669
This happens because the first for loop has reached the last word, so the second for loop is not executed and count equals zero (it was reset to zero at the end of the previous iteration). Just change the first for loop to skip the last word (processing it is not necessary):
for each_word in words_list[:-1]:
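A minimal sketch of the loop with that change applied, so it can be run without the gensim model. fake_similarity is a hypothetical placeholder for model.similarity, and average_scores is just an illustrative wrapper name. Using enumerate instead of words_list.index(each_word) is a variation of mine, not part of the change above: list.index returns the position of the first match, so it can pick the wrong starting point when a word appears more than once in the list.

```python
def fake_similarity(word, word2):
    # Hypothetical placeholder for model.similarity(word, word2).
    return 0.5

def average_scores(words_list):
    """Return one average score per word, compared with all later words."""
    averages = []
    # words_list[:-1] skips the last word, which has nothing after it.
    for i, each_word in enumerate(words_list[:-1]):
        word = each_word.strip()
        sentence_score = 0
        count = 0
        for each_word2 in words_list[i + 1:]:
            word2 = each_word2.strip()
            count += 1
            # Identical words score 1, otherwise ask the (stand-in) model.
            sentence_score += 1 if word == word2 else fake_similarity(word, word2)
        # count is at least 1 here because the last word was skipped.
        averages.append(round(sentence_score / count, 5))
    return averages

print(average_scores(["secondly\n", "pardon\n", "woods\n", "secondly"]))
```

Keeping count and sentence_score local to each outer iteration also removes the need to reset them by hand at the end of the loop.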
Upvotes: 1