Peter.k

Reputation: 1548

nltk: how to find a connection between words?

I'm using nltk and WordNet to link words that belong to the same group of relations. For example, 'parking' and 'building' should share some parent. I use hypernyms, but for some words there is no useful connection.

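from nltk.corpus import wordnet

x = wordnet.synset('parking.n.01')
y = wordnet.synset('building.n.01')

# Private NLTK helper: maps each hypernym of the synset to its distance.
# The argument only acts as a simulate_root flag, so each call describes one synset on its own.
print(x._shortest_hypernym_paths(y))
print(y._shortest_hypernym_paths(x))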

{Synset('parking.n.01'): 0, Synset('room.n.02'): 1, Synset('position.n.07'): 2, Synset('relation.n.01'): 3, Synset('abstraction.n.06'): 4, Synset('entity.n.01'): 5, Synset('ROOT'): 6}
{Synset('building.n.01'): 0, Synset('structure.n.01'): 1, Synset('artifact.n.01'): 2, Synset('whole.n.02'): 3, Synset('object.n.01'): 4, Synset('physical_entity.n.01'): 5, Synset('entity.n.01'): 6, Synset('ROOT'): 7}

Here the only connection goes through 'entity.n.01', which is the root for almost all nouns, so it says nothing about how the two words are related. How can I get something better than this?

I'd like to get something like 'parking' -> 'structure' -> 'building'. The path can be longer, but unrelated words should not match, for example 'monkey', which also leads up to entity.

Upvotes: 4

Views: 309

Answers (1)

Peter.k

Reputation: 1548

I found a helpful way to list the possibilities:

from nltk.corpus import wordnet

def getShortestHypernymPath(word1, word2, nulls=False):
    # Compare every synset of word1 with every synset of word2
    syns1 = wordnet.synsets(word1)
    syns2 = wordnet.synsets(word2)
    for s1 in syns1:
        for s2 in syns2:
            lch = s2.lowest_common_hypernyms(s1)
            # Skip pairs with no common hypernym unless nulls=True
            if len(lch) > 0 or nulls:
                print(s1, '<-->', s2, '===', lch)

getShortestHypernymPath('parking', 'building', nulls=False)

This returns:

Synset('parking.n.01') <--> Synset('building.n.01') === [Synset('entity.n.01')]
Synset('parking.n.01') <--> Synset('construction.n.01') === [Synset('abstraction.n.06')]
Synset('parking.n.01') <--> Synset('construction.n.07') === [Synset('abstraction.n.06')]
Synset('parking.n.01') <--> Synset('building.n.04') === [Synset('abstraction.n.06')]
Synset('parking.n.02') <--> Synset('building.n.01') === [Synset('entity.n.01')]
Synset('parking.n.02') <--> Synset('construction.n.01') === [Synset('act.n.02')]
Synset('parking.n.02') <--> Synset('construction.n.07') === [Synset('act.n.02')]
Synset('parking.n.02') <--> Synset('building.n.04') === [Synset('abstraction.n.06')]
Synset('park.v.02') <--> Synset('build.v.05') === [Synset('control.v.01')]

so at least I can see every candidate pairing and think it over.
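One possible refinement, as a rough sketch rather than a definitive fix: drop pairs whose lowest common hypernym sits too close to the root (such as 'entity.n.01') and sort the rest so that the most specific shared parent comes first. The names rank_by_common_hypernym, demo and the min_depth threshold are my own illustrative choices, not part of NLTK; the only synset methods it relies on are pos(), lowest_common_hypernyms() and min_depth(). The cut-off of 3 is an arbitrary assumption and probably needs tuning.

```python
def rank_by_common_hypernym(syns1, syns2, min_depth=3):
    """Return (depth, s1, s2, hypernym) tuples, most specific hypernym first.

    min_depth is a hypothetical cut-off: common hypernyms closer to the
    root than this are treated as too generic and dropped.
    """
    results = []
    for s1 in syns1:
        for s2 in syns2:
            # Nouns and verbs live in separate hierarchies, so skip mixed pairs
            if s1.pos() != s2.pos():
                continue
            for hyper in s1.lowest_common_hypernyms(s2):
                depth = hyper.min_depth()  # 0 for the root of the hierarchy
                if depth >= min_depth:
                    results.append((depth, s1, s2, hyper))
    # Sort on depth only, so synset objects are never compared to each other
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def demo(word1='parking', word2='building', min_depth=3):
    # Imported here so the ranking function works without the corpus;
    # running demo() assumes the WordNet data has been downloaded.
    from nltk.corpus import wordnet
    pairs = rank_by_common_hypernym(
        wordnet.synsets(word1), wordnet.synsets(word2), min_depth)
    for depth, s1, s2, hyper in pairs:
        print(depth, s1, '<-->', s2, '===', hyper)
```

Calling demo() prints only the pairs that survive the cut-off, and an empty result for a pair of words is itself a hint that the hypernym tree links them only through very generic concepts.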

Upvotes: 2
