vedant

Reputation: 1

StopIteration error when using multiple if conditions and next()

I am running this in a Jupyter notebook.

This code is from a YouTube video. It worked on the YouTuber's computer, but on mine it raises a StopIteration error.

I am trying to get all the titles (questions from the CSV) that are about the Go language.

import pandas as pd

df = pd.read_csv("Questions.csv", encoding = "ISO-8859-1", usecols = ["Title", "Id"])

titles = [_ for _ in df.loc[lambda d: d['Title'].str.lower().str.contains(" go "," golang ")]['Title']]

#new cell

import spacy

nlp = spacy.load("en_core_web_sm" , disable= ["ner"])

#new cell

def has_golang(text):
    doc = nlp(text)
    for t in doc:    
        if t.lower_ in [' go ', 'golang']:
            if t.pos_ != 'VERB':
                if t.dep_ == 'pobj':
                    return True
    return False

g = (title for title in titles if has_golang(title))
[next(g) for i in range(10)]

#This is the error

StopIteration                             Traceback (most recent call last)
<ipython-input-56-862339d10dde> in <module>
      9 
     10 g = (title for title in titles if has_golang(title))
---> 11 [next(g) for i in range(10)]

<ipython-input-56-862339d10dde> in <listcomp>(.0)
      9 
     10 g = (title for title in titles if has_golang(title))
---> 11 [next(g) for i in range(10)]

StopIteration: 

From the research I have done, I think it might be a bug.

All I want to do is get the titles that satisfy the 3 'if' conditions.

link to the youtube video

Upvotes: -1

Views: 1277

Answers (1)

Michael Ruth

Reputation: 3504

The StopIteration is the result of calling next() on an exhausted iterator, i.e. g produces fewer than 10 results. The built-in help() function documents this behavior:

help(next)
Help on built-in function next in module builtins:
next(...)
    next(iterator[, default])
    
    Return the next item from the iterator. If default is given and the iterator
    is exhausted, it is returned instead of raising StopIteration.
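
As a minimal sketch of what the help text describes (the list `items` and the `None` default are made up for illustration), passing a default to next() lets the comprehension run past the end of a short generator without raising:

```python
# Hypothetical data: a generator that yields only three items.
items = ["a", "b", "c"]
g = (x for x in items)

# With a default, next() returns None once g is exhausted
# instead of raising StopIteration.
first_five = [next(g, None) for _ in range(5)]
print(first_five)  # ['a', 'b', 'c', None, None]
```

The trailing None values show where the generator ran out, so this is mainly useful for diagnosing how many results g really has.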

Edit

Your has_golang is incorrect. The first test is always False because nlp splits the text into tokens, and a token does not include the spaces around the word, so t.lower_ can never equal ' go '. Try this:

def has_golang(text):
    doc = nlp(text)
    for t in doc:    
        if t.lower_ in ['go', 'golang']:
            if t.pos_ != 'VERB':
                if t.dep_ == 'pobj':
                    return True
    return False

I figured this out by finding a title which should result in True from has_golang. I then ran the following code:

doc = nlp("Making a Simple FileServer with Go and Localhost Refused to Connect")
print("\n".join(str((t.lower_, t.pos_, t.dep_)) for t in doc))
('making', 'VERB', 'csubj')
('a', 'DET', 'det')
('simple', 'PROPN', 'compound')
('fileserver', 'PROPN', 'dobj')
('with', 'ADP', 'prep')
('go', 'PROPN', 'pobj')
('and', 'CCONJ', 'cc')
('localhost', 'PROPN', 'conj')
('refused', 'VERB', 'ROOT')
('to', 'PART', 'aux')
('connect', 'VERB', 'xcomp')

Looking at ('go', 'PROPN', 'pobj'): PROPN is not VERB and pobj is pobj, so the second and third tests pass. The failure has to be in the first test: the token is "go", not " go ".
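
The same effect can be reproduced without spaCy. This is only a rough sketch: str.split() is used as a stand-in for the tokenizer (it is not how spaCy tokenizes), but like the tokens in the output above, its pieces have no surrounding spaces:

```python
# Stand-in for tokenization: str.split() is NOT spaCy, but the pieces it
# produces also carry no leading or trailing spaces.
tokens = "Making a Simple FileServer with Go".lower().split()

# The padded string ' go ' can never equal a token; the bare 'go' can.
padded_match = any(t in [" go ", "golang"] for t in tokens)
plain_match = any(t in ["go", "golang"] for t in tokens)
print(padded_match, plain_match)  # False True
```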


Original Response

If you just want the titles that satisfy the 3 if conditions, skip the generator:

g = list(filter(has_golang, titles))

If you need the generator but also want a list:

g = (title for title in titles if has_golang(title))
list(g)
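
If only the first ten matches are wanted, as in the question, itertools.islice stops after ten items and simply returns fewer when the generator runs out. A sketch, using a hypothetical `titles` list and a simplified stand-in predicate (`looks_like_golang`) in place of has_golang so that it runs without spaCy:

```python
from itertools import islice

# Hypothetical stand-in for the question's `titles`.
titles = ["Channels in Go", "Python lists", "Errors in golang", "Go to definition"]

def looks_like_golang(title):
    # Simplified word check for illustration only; not the spaCy-based test.
    return any(word in ("go", "golang") for word in title.lower().split())

g = (title for title in titles if looks_like_golang(title))

# Takes at most 10 items; no StopIteration if there are fewer.
first_ten = list(islice(g, 10))
print(first_ten)
```

With the real has_golang, the same islice call replaces the [next(g) for i in range(10)] line.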

Upvotes: 2
