Reputation: 495
I have successfully retrieved synsets connected to a base synset via other semantic relations, as follows:
wn.synset('good.a.01').also_sees()
Out[63]:
[Synset('best.a.01'),
Synset('better.a.01'),
Synset('favorable.a.01'),
Synset('good.a.03'),
Synset('obedient.a.01'),
Synset('respectable.a.01')]
wn.synset('good.a.01').similar_tos()
Out[64]:
[Synset('bang-up.s.01'),
Synset('good_enough.s.01'),
Synset('goodish.s.01'),
Synset('hot.s.15'),
Synset('redeeming.s.02'),
Synset('satisfactory.s.02'),
Synset('solid.s.01'),
Synset('superb.s.02'),
Synset('well-behaved.s.01')]
However, the antonym relation seems different. I managed to retrieve the lemma connected to my base synset, but was not able to retrieve the actual synset, like so:
wn.synset('good.a.01').lemmas()[0].antonyms()
Out[67]: [Lemma('bad.a.01.bad')]
How can I get the synset, rather than the lemma, that is connected via antonymy to my base synset, wn.synset('good.a.01')? TIA
Upvotes: 2
Views: 876
Reputation: 122082
For some reason, WordNet indexes antonymy relations at the Lemma level instead of the Synset level (see http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=good&i=8&h=00001000000000000000000000000000#c), so the question is whether Synsets and Lemmas have many-to-many or one-to-one relations.
In the case of ambiguous words (one word, many meanings), we have a one-to-many relation from String to Synset, e.g.
>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
In the case of one meaning/concept with multiple representations, we have a one-to-many relation from Synset to String (where String refers to Lemma names):
>>> dog = wn.synset('dog.n.1')
>>> dog.definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> dog.lemma_names()
[u'dog', u'domestic_dog', u'Canis_familiaris']
Note: up to now, we have been comparing the relationships between Strings and Synsets, not between Lemmas and Synsets.
The "cute" thing is that Lemma and String have a one-to-one relationship:
>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
>>> wn.synsets('dog')[0]
Synset('dog.n.01')
>>> wn.synsets('dog')[0].definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> wn.synsets('dog')[0].lemmas()
[Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), Lemma('dog.n.01.Canis_familiaris')]
>>> wn.synsets('dog')[0].lemmas()[0]
Lemma('dog.n.01.dog')
>>> wn.synsets('dog')[0].lemmas()[0].name()
u'dog'
The _name attribute of a Lemma object, which is what name() returns, is a single unicode string, not a list. See the code at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L202 and https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L444
And it seems that each Lemma belongs to exactly one Synset (while a Synset can hold several Lemmas, as shown above). From the docstring at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L220:
Lemma attributes, accessible via methods with the same name::
- name: The canonical name of this lemma.
- synset: The synset that this lemma belongs to.
- syntactic_marker: For adjectives, the WordNet string identifying the syntactic position relative modified noun. See: http://wordnet.princeton.edu/man/wninput.5WN.html#sect10 For all other parts of speech, this attribute is None.
- count: The frequency of this lemma in wordnet.
So we can do this, knowing that each Lemma object is only going to return us 1 synset:
>>> wn.synsets('dog')[0].lemmas()[0]
Lemma('dog.n.01.dog')
>>> wn.synsets('dog')[0].lemmas()[0].synset()
Synset('dog.n.01')
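Since every antonym Lemma points back to exactly one Synset, the hop from lemma to synset can be wrapped up in a small helper. This is only a sketch: antonym_synsets is a made-up name, not part of NLTK, and it relies on nothing beyond the lemmas(), antonyms() and synset() methods already shown here.

```python
def antonym_synsets(synset):
    """Return the synsets reached through the antonyms of any lemma in `synset`.

    `antonym_synsets` is a hypothetical helper, not an NLTK function. It only
    calls Synset.lemmas(), Lemma.antonyms() and Lemma.synset(). The order of
    first appearance is kept and duplicates are dropped.
    """
    found = []
    for lemma in synset.lemmas():
        for antonym in lemma.antonyms():
            target = antonym.synset()
            if target not in found:
                found.append(target)
    return found


# Usage with NLTK (needs the WordNet corpus to be installed):
#   from nltk.corpus import wordnet as wn
#   antonym_synsets(wn.synset('good.a.01'))
```

For the synset in the question, the first lemma's antonym is Lemma('bad.a.01.bad'), so the returned list should contain Synset('bad.a.01').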
Assuming that you are trying to do some sentiment analysis and you need the antonyms of every adjective in WordNet, you can easily do this to access the Synsets of the antonyms:
>>> from itertools import chain
>>> from nltk.corpus import wordnet as wn
>>> all_adj_in_wn = wn.all_synsets(pos='a')
>>> def get_antonyms(ss):
...     # Collect the synset of every antonym lemma of every lemma in ss.
...     return set(chain(*[[a.synset() for a in l.antonyms()] for l in ss.lemmas()]))
...
>>> for ss in all_adj_in_wn:
...     print(ss, ':', get_antonyms(ss))
...
Synset('able.a.01') : {Synset('unable.a.01')}
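If you would rather keep the results than print them, the same traversal can fill a dictionary. A rough sketch, assuming you only care about synsets that actually have an antonym; antonym_table is a hypothetical name, and it uses Synset.name() in addition to the methods shown above.

```python
def antonym_table(synsets):
    """Map each synset name to the sorted names of its antonym synsets.

    `antonym_table` is a hypothetical helper, not an NLTK function.
    Synsets whose lemmas have no antonyms are left out of the result.
    """
    table = {}
    for ss in synsets:
        # A set comprehension removes repeats when several lemmas
        # share the same antonym synset.
        names = sorted({a.synset().name()
                        for l in ss.lemmas()
                        for a in l.antonyms()})
        if names:
            table[ss.name()] = names
    return table


# Usage with NLTK (needs the WordNet corpus to be installed):
#   from nltk.corpus import wordnet as wn
#   table = antonym_table(wn.all_synsets(pos='a'))
```

Storing names instead of Synset objects keeps the table easy to dump to JSON; look a name up again with wn.synset(name) when you need the object.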
Upvotes: 1