Sachin S
Sachin S

Reputation: 396

Python - Find descendants and ancestors from the unsorted hierarchy file

I have a unsorted parent-child hierarchy file (tab delimited) in the following format:

City1   Area1
City1   Area2
Continent1  Country1
Continent2  Country2
Continent3  Country3
Continent4  Country4
Continents  Continent1
Continents  Continent2
Continents  Continent3
Continents  Continent4
Country1    State1
Country2    State2
Country3    State3
Earth   Continents
State1  City1
State1  City1.1
State2  City2

My goal is to find all the "descendants" and "ancestors" of given member.

Here is what I have coded do far:

import sys, re

with open("input.txt", "r") as my_in:
    collections={}
    for line in my_in:
        parent, child=line.rstrip('\r\n').split('\t')
        collections.setdefault(parent, []).append(child)

print (collections)
'''
{'Continent4': ['Country4'], 'Continent2': ['Country2'], 
'Continents': ['Continent1', 'Continent2', 'Continent3', 'Continent4'], 
'Continent1': ['Country1'], 'Country2': ['State2'], 
'Country3': ['State3'], 'State1': ['City1', 'City1.1'], 
'Country1': ['State1'], 'State2': ['City2'], 
'Earth': ['Continents'], 'City1': ['Area1', 'Area2'], 'Continent3': ['Country3']}
'''

def find_descendants(parent, collections):
descendants = []
for descendant in collections[parent]:
    if descendant in collections:
        descendants = descendants + find_descendants(descendant, collections)
    else:
        descendants.append(descendant)
return descendants

# Get descendants of "Continent1":
lis=find_descendants("Continent1", collections)
print (lis) # It shows ['Area1', 'Area2', 'City1.1']
# Actually it should show ['Country1', 'State1', 'City1', 'Area1', 'Area2',   'City1.1']

def find_ancestors(child, collections):
    # pseudo code
    # link child to its parent and parent to its parent until no more parents are found
    pass

# lis=find_ancestors("City1.1", collections)
# should show ['Earth', 'Continents', 'Continent1', 'Country1', 'State1']

The function find_descendants is not working as expected. And as far as find_ancestors function is concerned, although I know the pseudo code, I am not able to express it in Python.

Please help.

Upvotes: 0

Views: 1509

Answers (2)

fenceop
fenceop

Reputation: 1497

Here's a simpler solution that uses networkx:

import networkx as nx

coll = nx.DiGraph()
with open("input.txt") as f:
    for line in map(str.strip, f):
        ancestor, descendant = line.split("\t")
        coll.add_edge(ancestor, descendant)

print(nx.descendants(coll, "Continent1"))
# {'Area2', 'City1.1', 'Area1', 'City1', 'State1', 'Country1'}

print(nx.ancestors(coll, "City1.1"))
# {'Earth', 'Continent1', 'State1', 'Continents', 'Country1'}

Both functions return a set so the ancestors and descendants are not ordered.

Upvotes: 0

RaphaMex
RaphaMex

Reputation: 2839

As I said in comments, you forget to append your descendant before looking deeper in your collection. This works:

def find_descendants(parent, collections):
    descendants = []
    for descendant in collections[parent]:
        descendants.append(descendant)
        if descendant in collections:
            descendants = descendants + find_descendants(descendant, collections)
    return descendants

For ancestors, just build an other collections, say ancestors_collection, that stores the reverse relation descendant/ancestor. The function to find ancestors should then be exactly the same as find_descendants, which you can rename accordingly.

EDIT:

Here a complete working code, I use relative to refer to ancestor or descendant:

import sys, re

with open("input.txt", "r") as my_in:
    descendants={}
    ancestors={}
    for line in my_in:
        parent, child=line.rstrip('\r\n').split('\t')
        descendants.setdefault(parent, []).append(child)
        ancestors.setdefault(child, []).append(parent)

def get_relatives(element, collection):
    relatives = []
    for relative in collection[element]:
        relatives.append(relative)
        if relative in collection:
            relatives = relatives + get_relatives(relative, collection)
    return relatives

# Get descendants of "Continent1":
lis=get_relatives("Continent1", descendants)
print (lis)
# shows ['Country1', 'State1', 'City1', 'Area1', 'Area2',   'City1.1']

lis=get_relatives("City1.1", ancestors)
print (lis)
# shows ['Earth', 'Continents', 'Continent1', 'Country1', 'State1']

Upvotes: 1

Related Questions