Reputation: 1548
I'm using NLTK and WordNet to link words that belong to the same group of relations. For example, 'parking' and 'building' should share some parent linkage. I use hypernyms, but for some words there is no connection.
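from nltk.corpus import wordnet

x = wordnet.synset('parking.n.01')
y = wordnet.synset('building.n.01')
# _shortest_hypernym_paths is a private NLTK method (leading underscore)
print(x._shortest_hypernym_paths(y))
print(y._shortest_hypernym_paths(x))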
{Synset('parking.n.01'): 0, Synset('room.n.02'): 1, Synset('position.n.07'): 2, Synset('relation.n.01'): 3, Synset('abstraction.n.06'): 4, Synset('entity.n.01'): 5, Synset('ROOT'): 6}
{Synset('building.n.01'): 0, Synset('structure.n.01'): 1, Synset('artifact.n.01'): 2, Synset('whole.n.02'): 3, Synset('object.n.01'): 4, Synset('physical_entity.n.01'): 5, Synset('entity.n.01'): 6, Synset('ROOT'): 7}
Here, the only connection goes through 'entity.n.01', which is the root of the WordNet noun hierarchy, so almost any two nouns meet there. How can I get something more specific than this?
I'd like to get something like 'parking' -> 'structure' -> 'building'. The path can be longer, but "alien" words shouldn't show up in it, for example 'monkey', which also links up to entity.
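To make "something better than entity" concrete, here is a minimal sketch that works on plain dictionaries shaped like the two printed above (synset names as strings instead of Synset objects, so it runs without the WordNet data). The function name `closest_specific_ancestor` and the `GENERIC` blocklist are my own assumptions, not part of NLTK; the idea is just to ignore shared ancestors that are too generic to be meaningful.

```python
# Ancestors considered too generic to count as a real link (assumed list).
GENERIC = frozenset({"ROOT", "entity.n.01", "physical_entity.n.01", "abstraction.n.06"})


def closest_specific_ancestor(paths_a, paths_b, generic=GENERIC):
    """Return (name, combined_distance) of the nearest shared ancestor.

    paths_a and paths_b map ancestor names to their distance from the
    starting synset. Ancestors listed in `generic` are ignored.
    Returns None when no acceptable shared ancestor exists.
    """
    shared = (set(paths_a) & set(paths_b)) - generic
    if not shared:
        return None
    # Smallest combined distance wins; the name breaks ties deterministically.
    name = min(shared, key=lambda n: (paths_a[n] + paths_b[n], n))
    return name, paths_a[name] + paths_b[name]


# The two dictionaries from the output above, keyed by synset name.
parking = {"parking.n.01": 0, "room.n.02": 1, "position.n.07": 2,
           "relation.n.01": 3, "abstraction.n.06": 4, "entity.n.01": 5, "ROOT": 6}
building = {"building.n.01": 0, "structure.n.01": 1, "artifact.n.01": 2,
            "whole.n.02": 3, "object.n.01": 4, "physical_entity.n.01": 5,
            "entity.n.01": 6, "ROOT": 7}

print(closest_specific_ancestor(parking, building))  # None
```

For this pair the result is None, which at least tells me that these two particular senses have no meaningful hypernym link, rather than silently reporting 'entity'.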
Upvotes: 4
Views: 309
Reputation: 1548
I found a helpful way to view the possibilities:
from nltk.corpus import wordnet

def getShortestHypernymPath(word1, word2, nulls=False):
    # Compare every sense of word1 with every sense of word2
    syns1 = wordnet.synsets(word1)
    syns2 = wordnet.synsets(word2)
    for s1 in syns1:
        for s2 in syns2:
            lch = s2.lowest_common_hypernyms(s1)
            # Skip pairs with no common hypernym unless nulls=True
            if len(lch) > 0 or nulls:
                print(s1, '<-->', s2, '===', lch)

getShortestHypernymPath('parking', 'building', nulls=False)
This returns:
Synset('parking.n.01') <--> Synset('building.n.01') === [Synset('entity.n.01')]
Synset('parking.n.01') <--> Synset('construction.n.01') === [Synset('abstraction.n.06')]
Synset('parking.n.01') <--> Synset('construction.n.07') === [Synset('abstraction.n.06')]
Synset('parking.n.01') <--> Synset('building.n.04') === [Synset('abstraction.n.06')]
Synset('parking.n.02') <--> Synset('building.n.01') === [Synset('entity.n.01')]
Synset('parking.n.02') <--> Synset('construction.n.01') === [Synset('act.n.02')]
Synset('parking.n.02') <--> Synset('construction.n.07') === [Synset('act.n.02')]
Synset('parking.n.02') <--> Synset('building.n.04') === [Synset('abstraction.n.06')]
Synset('park.v.02') <--> Synset('build.v.05') === [Synset('control.v.01')]
so I can at least meditate on it.
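A possible next step, sketched under some assumptions: instead of only printing the pairs, rank them by how deep their lowest common hypernym sits in the hierarchy, so that pairs meeting at 'entity' or 'abstraction' fall away. The function name `rank_sense_pairs` and the `min_depth=2` cutoff are hypothetical choices of mine. The function only relies on the arguments behaving like NLTK Synset objects, that is, offering `lowest_common_hypernyms(other)` and `min_depth()`.

```python
def rank_sense_pairs(synsets1, synsets2, min_depth=2):
    """Return (depth, s1, s2, lch) tuples, most specific shared hypernym first.

    synsets1 / synsets2 are iterables of Synset-like objects.
    min_depth is an assumed cutoff: common hypernyms closer to the root
    than this are treated as too generic and dropped.
    """
    ranked = []
    for s1 in synsets1:
        for s2 in synsets2:
            for lch in s1.lowest_common_hypernyms(s2):
                depth = lch.min_depth()  # distance of the hypernym from the root
                if depth >= min_depth:
                    ranked.append((depth, s1, s2, lch))
    # Sort on depth only, so the synset objects themselves are never compared.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

With NLTK installed and the WordNet corpus downloaded, it would be called as rank_sense_pairs(wordnet.synsets('parking'), wordnet.synsets('building')). The cutoff likely needs tuning per use case, since what counts as "too generic" depends on the words involved.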
Upvotes: 2