Bluetail

Reputation: 1291

Converting a loop into a list comprehension raises IndexError: list index out of range

I would like to understand how a list comprehension would work here.

I have this loop and it works.


    token = nltk.word_tokenize(doc)
    # add parts of speech to token
    pos = nltk.pos_tag(token)
    nsets = []
    for w, p in pos:
        s = wn.synsets(w, convert_tag(p))
        if s:
            nsets.append(s[0])
        else:
            continue

However, when I try to write it as a list comprehension like this:

    nsets = [s[0] for w, p in pos if s == wn.synsets(w, convert_tag(p))]

I get

IndexError                                Traceback (most recent call last)
<ipython-input-26-406837792edd> in <module>()
----> 1 doc_to_synsets('Tom loves to play petanque')

<ipython-input-25-1eca09bded8e> in doc_to_synsets(doc)
     44             continue
     45 
---> 46     nsets = [s[0] for w, p in pos if s == wn.synsets(w, convert_tag(p))]
     47 
     48     nltk2wordnet = [(i[0], convert_tag(i[1])) for i in pos]

<ipython-input-25-1eca09bded8e> in <listcomp>(.0)
     44             continue
     45 
---> 46     nsets = [s[0] for w, p in pos if s == wn.synsets(w, convert_tag(p))]
     47 
     48     nltk2wordnet = [(i[0], convert_tag(i[1])) for i in pos]

IndexError: list index out of range

I have tried adding len(s[0])>0 and len(s)> at the end of the list comprehension, as I have seen in similar questions, but it did not help. Thank you.

Upvotes: 0

Views: 76

Answers (3)

maciejwww

Reputation: 1196

Since Python 3.8, you can use the walrus operator (:=), also called an assignment expression, to bind s inside the condition:

    nsets = [s[0] for w, p in pos if (s := wn.synsets(w, convert_tag(p)))]
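For anyone who wants to try the pattern without installing NLTK, here is a minimal sketch. The lookup function and the FAKE_SYNSETS table are hypothetical stand-ins for wn.synsets and are not part of the original code; the walrus operator itself needs Python 3.8 or later.

```python
# Hypothetical stand-in for wn.synsets: maps a word to a (possibly empty) list.
FAKE_SYNSETS = {
    "Tom": ["tom.n.01", "tom.n.02"],
    "loves": ["love.v.01"],
    "play": ["play.v.01", "play.v.02"],
}


def lookup(word):
    """Return the list of fake synsets for word, or [] if there are none."""
    return FAKE_SYNSETS.get(word, [])


words = ["Tom", "loves", "to", "play", "petanque"]

# s is bound inside the condition, so s[0] is only evaluated
# for words whose lookup returned a non-empty list.
first_synsets = [s[0] for w in words if (s := lookup(w))]
print(first_synsets)
```

Words with no entry ("to" and "petanque" in this made-up table) are skipped, just as in the original loop.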

Upvotes: 1

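If you really want a list comprehension here, you need to deal with the fact that s is never assigned inside it: the condition s == wn.synsets(...) is a comparison, not an assignment, so it presumably tests against whatever value s was left holding by the earlier loop. Without an assignment expression there is no tidy way to bind s inside the comprehension, so the straightforward fix is to call wn.synsets(w, convert_tag(p)) twice.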

    synsets = [wn.synsets(w, convert_tag(p))[0] for w, p in pos if wn.synsets(w, convert_tag(p))]

But since you are calling that same function twice, the list comprehension is going to be slower than the original code.

So the question becomes whether you want to avoid the temporary variable s, or have faster code that runs wn.synsets(w, convert_tag(p)) only once per word. In the grand scheme of things, the single extra temporary variable is usually the better option: it has a small, fixed footprint, whereas the double call repeats the lookup for every word that has a synset, so the extra cost grows with the length of the input.
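To make the cost concrete, here is a rough sketch that counts calls. The lookup function is a hypothetical stand-in for wn.synsets (it just treats words longer than two letters as "found"), so the numbers only illustrate the pattern. The second version iterates over a generator of results, which is one way to call the function once per word on Python versions older than 3.8.

```python
call_count = 0


def lookup(word):
    """Hypothetical stand-in for wn.synsets that counts how often it is called."""
    global call_count
    call_count += 1
    return [word + ".01"] if len(word) > 2 else []


words = ["Tom", "loves", "to", "play", "petanque"]

# Version 1: the lookup appears twice, so it runs once in the filter
# and again in the expression for every word that passes the filter.
call_count = 0
twice = [lookup(w)[0] for w in words if lookup(w)]
calls_twice = call_count

# Version 2: iterate over a generator of results, so each word is looked up once.
call_count = 0
once = [s[0] for s in (lookup(w) for w in words) if s]
calls_once = call_count

print(calls_twice, calls_once)
```

Both versions build the same list; the first simply pays for one extra lookup per matching word.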

Upvotes: 3

AriesNinja

Reputation: 55

List indexing begins at 0, so if you have 23 items in a list, your last item is at index 22.
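A small sketch of how that plays out, including the empty-list case that produces the same error message as in the question. The variable names here are made up for illustration.

```python
items = ["a", "b", "c"]
print(items[0])               # first item
print(items[len(items) - 1])  # last item, same as items[-1]

empty = []
try:
    empty[0]  # an empty list has no index 0
except IndexError as exc:
    message = str(exc)
    print(message)

# An empty list is falsy, so a truthiness check guards the index safely.
first = empty[0] if empty else None
```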

Upvotes: 1
