NLTK jumping synset names - python

Question

From the NLTK WordNet API:

>>> from nltk.corpus import wordnet as wn
>>> for i in wn.synsets('discover'):
...     print(i, i.offset())
... 
Synset('detect.v.01') 2154508
Synset('learn.v.02') 598954
Synset('discover.v.03') 1637982
Synset('discover.v.04') 721437
Synset('fall_upon.v.01') 2286687
Synset('unwrap.v.02') 933821
Synset('discover.v.07') 2128066
Synset('identify.v.05') 652346

>>> wn.synset('discover.v.8')
Synset('identify.v.05')

The index.verb file from WordNet 3.0 has this entry:

discover v 8 6 @ ~ * > $ + 8 7 02154508 00598954 01637982 00721437 02286687 00933821 02128066 00652346
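To make the index line easier to compare with the output above, here is a minimal sketch that splits it into its fields. The field layout (lemma, pos, synset_cnt, p_cnt, pointer symbols, sense_cnt, tagsense_cnt, then the synset offsets) is taken from the WordNet database format documentation; `parse_index_line` is a hypothetical helper name, not part of NLTK.

```python
# Field layout assumed from the WordNet database documentation (wndb):
# lemma pos synset_cnt p_cnt [ptr_symbol...] sense_cnt tagsense_cnt synset_offset...
INDEX_LINE = ("discover v 8 6 @ ~ * > $ + 8 7 02154508 00598954 01637982 "
              "00721437 02286687 00933821 02128066 00652346")


def parse_index_line(line):
    """Split one index.* line into lemma, POS, and the ordered synset offsets."""
    fields = line.split()
    lemma, pos = fields[0], fields[1]
    synset_cnt = int(fields[2])
    p_cnt = int(fields[3])
    # Skip the p_cnt pointer symbols, then sense_cnt and tagsense_cnt.
    first_offset = 4 + p_cnt + 2
    offsets = [int(f) for f in fields[first_offset:first_offset + synset_cnt]]
    return lemma, pos, offsets


lemma, pos, offsets = parse_index_line(INDEX_LINE)
for sense_number, offset in enumerate(offsets, start=1):
    print("%s.%s.%d -> offset %d" % (lemma, pos, sense_number, offset))
```

The eighth offset is 652346, the same offset printed for Synset('identify.v.05') in the session above, which is consistent with discover.v.8 being resolved by position in this list.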

I have checked the source of the NLTK WordNet reader (http://www.nltk.org/_modules/nltk/corpus/reader/wordnet.html), but it does not say much about how discover.v.8 is mapped to identify.v.05.

Can anyone explain how this mapping occurs?

How can I extract a list of these mappings?
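For reference, the kind of list I am after could be sketched as follows. This assumes the synsets come back in the same order as the offsets in the index file, which the output above is consistent with; `sense_aliases` is a hypothetical helper, and the names are copied from the session above rather than read from NLTK.

```python
def sense_aliases(lemma, pos, synset_names):
    """Map '<lemma>.<pos>.NN' to the canonical name of the NN-th synset."""
    aliases = {}
    for sense_number, name in enumerate(synset_names, start=1):
        alias = "%s.%s.%02d" % (lemma, pos, sense_number)
        if alias != name:  # keep only the names that "jump"
            aliases[alias] = name
    return aliases


# Names copied from the session above. With NLTK installed they could
# presumably come from [s.name() for s in wn.synsets('discover', pos='v')]
# (s.name, without parentheses, in older releases where it is an attribute).
names = ["detect.v.01", "learn.v.02", "discover.v.03", "discover.v.04",
         "fall_upon.v.01", "unwrap.v.02", "discover.v.07", "identify.v.05"]

mapping = sense_aliases("discover", "v", names)
for alias, name in sorted(mapping.items()):
    print(alias, "->", name)
```

Senses 3, 4 and 7 are left out because their canonical name already uses the lemma "discover".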

