Cheshie
Cheshie

Reputation: 2837

Find a path in an nltk.tree.Tree

I'm using nltk.tree.Tree in order to read a constituency-based parse tree. I need to find the path of nodes I need to move through to get from one specific word in the tree, to another.

A quick example:

This is the parse tree of the sentece "saw the dog":

(VP (VERB saw) (NP (DET the) (NOUN dog)))

If I want the path between the word the and dog, it would be: DET, NP, NOUN.

I'm not even sure how to start: how do I find the values of the leaves of the tree? How could I find a leave's/node's parent?

Thanks.

Upvotes: 2

Views: 1525

Answers (1)

Nikita Astrakhantsev
Nikita Astrakhantsev

Reputation: 4749

Here is the code:

def get_lca_length(location1, location2):
    i = 0
    while i < len(location1) and i < len(location2) and location1[i] == location2[i]:
        i+=1
    return i

def get_labels_from_lca(ptree, lca_len, location):
    labels = []
    for i in range(lca_len, len(location)):
        labels.append(ptree[location[:i]].label())
    return labels

def findPath(ptree, text1, text2):
    leaf_values = ptree.leaves()
    leaf_index1 = leaf_values.index(text1)
    leaf_index2 = leaf_values.index(text2)

    location1 = ptree.leaf_treeposition(leaf_index1)
    location2 = ptree.leaf_treeposition(leaf_index2)

    #find length of least common ancestor (lca)
    lca_len = get_lca_length(location1, location2)

    #find path from the node1 to lca

    labels1 = get_labels_from_lca(ptree, lca_len, location1)
    #ignore the first element, because it will be counted in the second part of the path
    result = labels1[1:]
    #inverse, because we want to go from the node to least common ancestor
    result = result[::-1]

    #add path from lca to node2
    result = result + get_labels_from_lca(ptree, lca_len, location2)
    return result

ptree = ParentedTree.fromstring("(VP (VERB saw) (NP (DET the) (NOUN dog)))")
print(ptree.pprint())
print(findPath(ptree, 'the', "dog"))

It is based on list representation of trees, see here. Also check similar questions.

Upvotes: 1

Related Questions