user1946217

Reputation: 1753

WordNet synset in Python

I am using the wordnet.synset() function in my code:

>>> cb = wordnet.synset('fever.n.01')
>>> cb
Synset('fever.n.01')

>>> cb = wordnet.synset('disbelieve.n.01')

Traceback (most recent call last):
  File "<pyshell#60>", line 3, in <module>
    cb = wordnet.synset('disbelieve.n.01')
  File "C:\Python27\lib\site-packages\nltk\corpus\reader\wordnet.py", line 1016, in synset
    raise WordNetError(message % (lemma, pos))
WordNetError: no lemma 'disbelieve' with part of speech 'n'

>>> cb = wordnet.synset('disbelieve.v.01')
>>> cb
Synset('disbelieve.v.01')

'disbelieve.v.01' exists in WordNet, but nltk.pos_tag tags the word as a noun:

>>> import nltk
>>> tagged = nltk.pos_tag(['disbelieve'])  # pos_tag expects a list of tokens, not a bare string
>>> tagged
[('disbelieve', 'NN')]

Going forward, I will be using WordNet's synset similarity functions. I do not want to rely on POS tags to build the synset name, because the tagger can pick the wrong part of speech and trigger the error shown above.

Is there a function in NLTK that checks whether a word (say, 'disbelieve') exists in WordNet and, if so, returns its full synset name as stored in WordNet (i.e. 'disbelieve.v.01')?

Upvotes: 0

Views: 2036

Answers (1)

Jared

Reputation: 26397

You can get a list of all synsets for a given word, regardless of part of speech, with wn.synsets():

>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('disbelieve')
[Synset('disbelieve.v.01')]

One idea is to prefer the synsets whose POS matches the tag from your tagger, and otherwise just pick the first synset in the list.
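A minimal sketch of that idea follows. The helper names (`penn_to_wordnet_pos`, `choose_synset`, `best_synset`) are made up for illustration and are not part of NLTK. It assumes NLTK 3, where `Synset.pos()` is a method (in older releases it was an attribute), and it assumes the tagger emits Penn Treebank tags such as 'NN' or 'VB'.

```python
def penn_to_wordnet_pos(tag):
    """Map a Penn Treebank tag (e.g. 'NN', 'VBD') to a WordNet POS letter, or None."""
    if tag.startswith('J'):
        return 'a'  # adjective
    if tag.startswith('V'):
        return 'v'  # verb
    if tag.startswith('N'):
        return 'n'  # noun
    if tag.startswith('R'):
        return 'r'  # adverb
    return None


def choose_synset(synsets, preferred_pos=None):
    """Return the first synset matching preferred_pos, else the first synset, else None."""
    for synset in synsets:
        pos = synset.pos()
        # WordNet labels satellite adjectives 's'; treat them as adjectives here.
        if pos == 's':
            pos = 'a'
        if pos == preferred_pos:
            return synset
    # No POS match (or no preference given): fall back to the first synset, if any.
    return synsets[0] if synsets else None


def best_synset(word, penn_tag=None):
    """Look the word up in WordNet and pick a synset, preferring the tagger's POS."""
    # Imported lazily so the two helpers above can be used without NLTK installed.
    from nltk.corpus import wordnet as wn
    preferred = penn_to_wordnet_pos(penn_tag) if penn_tag else None
    return choose_synset(wn.synsets(word), preferred)
```

With the WordNet corpus installed, `best_synset('disbelieve', 'NN')` should fall back to the verb synset, since the lookup above returns only `Synset('disbelieve.v.01')`. A return value of `None` means the word is not in WordNet at all.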

Upvotes: 2
