windboy

Reputation: 133

ZeroDivisionError, but I can't find the error

I have encountered a division-by-zero error but I can't find its cause. I want to compare the words in a text file that contains the following:

secondly
pardon
woods
secondly

I wrote the script to compare the words in pairs, this way:

secondly, pardon
secondly, woods
secondly, secondly
pardon, woods
pardon, secondly
woods, secondly

My code does the following:

1) If the words are the same, the score is 1; otherwise the score is calculated by the gensim vector model.
2) There is a counter, which resets when the first for loop moves to the next word. E.g. secondly, pardon > secondly, woods > secondly, secondly (at this point the count is 3).

The code

from __future__ import division
import gensim


textfile = 'businessCleanTxtUniqueWords'
model = gensim.models.Word2Vec.load("businessSG")
count = 0  # keep track of counter
score = 0
avgScore = 0
SentenceScore = 0
externalCount = 0
totalAverageScore = 0

with open(textfile, 'r+') as f1:

    words_list = f1.readlines()

    for each_word in words_list:
        word = each_word.strip()

        for each_word2 in words_list[words_list.index(each_word) + 1:]:
            count = count + 1

            try:
                word2 = each_word2.strip()
                print(word, word2)
                # if words are the same
                if (word == word2):
                    score = 1
                else:
                    score = model.similarity(word,word2) # when words are not the same
            # if word is not in vector model
            except KeyError:
                score = 0
            # to keep track of the score
            SentenceScore=SentenceScore + score

            print("the score is: " + str(score))
            print("the count is: " + str(count))
        # average score
        avgScore = round(SentenceScore / count,5)

        print("the avg score: " + str(SentenceScore) + '/' + str(count) + '=' + str(avgScore))
        # reset counter and sentence score
        count = 0
        SentenceScore = 0

The error message:

Traceback (most recent call last):
  File "C:/Users/User/Desktop/Complete2/Complete/TrainedTedModel/LatestJR.py", line 41, in <module>
    avgScore = round(SentenceScore / count,5)
ZeroDivisionError: division by zero
('secondly', 'pardon')
the score is: 0.180233083443
the count is: 1
('secondly', 'woods')
the score is: 0.181432347816
the count is: 2
('secondly', 'secondly')
the score is: 1
the count is: 3
the avg score: 1.36166543126/3=0.45389
('pardon', 'woods')
the score is: 0.405021005657
the count is: 1
('pardon', 'secondly')
the score is: 0.180233083443
the count is: 2
the avg score: 0.5852540891/2=0.29263
('woods', 'secondly')
the score is: 0.181432347816
the count is: 1
the avg score: 0.181432347816/1=0.18143

I have included "from __future__ import division" for the division, but it does not seem to fix it.

My files can be found at the following links:

Gensim model:

https://entuedu-my.sharepoint.com/personal/jseng001_e_ntu_edu_sg/_layouts/15/guestaccess.aspx?guestaccesstoken=BlORQpsmI6RMIja55I%2bKO9oF456w5tBLR43XZdVCQIA%3d&docid=00459c024d33d48638508dd331cf73144&rev=1&expiration=2016-11-25T23%3a56%3a48.000Z

Textfile:

https://entuedu-my.sharepoint.com/personal/jseng001_e_ntu_edu_sg/_layouts/15/guestaccess.aspx?guestaccesstoken=7%2b8Nkm9BySPFR0zqD%2fdgUcYOaXREG3%2fycALnMFcv59A%3d&docid=08158c442c3f74970bc8090f253b499f8&rev=1&expiration=2016-11-25T23%3a56%3a01.000Z

Thank you.

Upvotes: 0

Views: 832

Answers (2)

Tadhg McDonald-Jensen

Reputation: 21453

The line that goes wrong is stated directly in the error message:

Traceback (most recent call last):
  File "C:/Users/User/Desktop/Complete2/Complete/TrainedTedModel/LatestJR.py", line 41, in <module>
    avgScore = round(SentenceScore / count,5)
ZeroDivisionError: division by zero

So I'm going to assume that SentenceScore / count is the division in question, which means count is 0. I would suggest adding something like this right before that line:

print("SentenceScore is",SentenceScore, "and count is",count)

so you can see this for yourself. Now, the inner loop:

for each_word2 in words_list[words_list.index(each_word) + 1:]:
    count = count + 1

is the only thing that adds to count, and count is reset to zero at the end of each iteration of the outer loop. That means the inner loop is not running at all at some point, i.e. words_list[words_list.index(each_word) + 1:] is an empty sequence. This will happen when each_word is the last word in words_list.
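To see the empty slice without loading the model, here is a minimal sketch. It assumes the file's last line has no trailing newline, which is a guess based on the output in the question; `pairs_per_word` is a hypothetical helper, not part of the original script.

```python
# Assumed result of readlines() on the question's file: the last line
# has no trailing newline (a guess, not confirmed by the question).
words_list = ["secondly\n", "pardon\n", "woods\n", "secondly"]


def pairs_per_word(words):
    """Return, for each word, how many words the inner loop would visit."""
    counts = []
    for each_word in words:
        # Same slice as in the question's inner loop.
        rest = words[words.index(each_word) + 1:]
        counts.append(len(rest))
    return counts


counts = pairs_per_word(words_list)
print(counts)  # [3, 2, 1, 0]
```

The final 0 is the iteration where count is still zero when the average is computed.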

Upvotes: 1

acw1668

Reputation: 46669

This happens because the first for loop has reached the last word, so the second for loop is not executed and count stays at zero (it was reset to zero at the end of the previous iteration). Just change the first for loop to skip the last word, since it has nothing left to be compared with:

for each_word in words_list[:-1]:
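A fuller sketch of the same idea, for illustration only. `average_scores` and `fake_similarity` are hypothetical names; `fake_similarity` stands in for model.similarity so the example runs without the trained model. It also uses enumerate instead of words_list.index(), because index() returns the first occurrence and the file contains a repeated word.

```python
def average_scores(words, similarity):
    """Average each word's score against the words that follow it.

    `similarity` is a stand-in for model.similarity (an assumption here).
    """
    averages = []
    # Skip the last word: nothing follows it, so its count would be 0.
    for i, word in enumerate(words[:-1]):
        total = 0.0
        count = 0
        for word2 in words[i + 1:]:
            count += 1
            if word == word2:
                total += 1  # identical words score 1
            else:
                try:
                    total += similarity(word, word2)
                except KeyError:
                    pass  # word missing from the model scores 0
        averages.append(round(total / count, 5))
    return averages


def fake_similarity(a, b):
    # Hypothetical constant score, used only to make the sketch runnable.
    return 0.5


words = ["secondly", "pardon", "woods", "secondly"]
result = average_scores(words, fake_similarity)
print(result)  # [0.66667, 0.5, 0.5]
```

With the real model you would pass model.similarity as the second argument and strip each line before building the list.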

Upvotes: 1