Reputation: 396
I have a unsorted parent-child hierarchy file (tab delimited) in the following format:
City1 Area1
City1 Area2
Continent1 Country1
Continent2 Country2
Continent3 Country3
Continent4 Country4
Continents Continent1
Continents Continent2
Continents Continent3
Continents Continent4
Country1 State1
Country2 State2
Country3 State3
Earth Continents
State1 City1
State1 City1.1
State2 City2
My goal is to find all the "descendants" and "ancestors" of given member.
Here is what I have coded do far:
import sys, re
with open("input.txt", "r") as my_in:
collections={}
for line in my_in:
parent, child=line.rstrip('\r\n').split('\t')
collections.setdefault(parent, []).append(child)
print (collections)
'''
{'Continent4': ['Country4'], 'Continent2': ['Country2'],
'Continents': ['Continent1', 'Continent2', 'Continent3', 'Continent4'],
'Continent1': ['Country1'], 'Country2': ['State2'],
'Country3': ['State3'], 'State1': ['City1', 'City1.1'],
'Country1': ['State1'], 'State2': ['City2'],
'Earth': ['Continents'], 'City1': ['Area1', 'Area2'], 'Continent3': ['Country3']}
'''
def find_descendants(parent, collections):
descendants = []
for descendant in collections[parent]:
if descendant in collections:
descendants = descendants + find_descendants(descendant, collections)
else:
descendants.append(descendant)
return descendants
# Get descendants of "Continent1":
lis=find_descendants("Continent1", collections)
print (lis) # It shows ['Area1', 'Area2', 'City1.1']
# Actually it should show ['Country1', 'State1', 'City1', 'Area1', 'Area2', 'City1.1']
def find_ancestors(child, collections):
# pseudo code
# link child to its parent and parent to its parent until no more parents are found
pass
# lis=find_ancestors("City1.1", collections)
# should show ['Earth', 'Continents', 'Continent1', 'Country1', 'State1']
The function find_descendants is not working as expected. And as far as find_ancestors function is concerned, although I know the pseudo code, I am not able to express it in Python.
Please help.
Upvotes: 0
Views: 1509
Reputation: 1497
Here's a simpler solution that uses networkx
:
import networkx as nx
coll = nx.DiGraph()
with open("input.txt") as f:
for line in map(str.strip, f):
ancestor, descendant = line.split("\t")
coll.add_edge(ancestor, descendant)
print(nx.descendants(coll, "Continent1"))
# {'Area2', 'City1.1', 'Area1', 'City1', 'State1', 'Country1'}
print(nx.ancestors(coll, "City1.1"))
# {'Earth', 'Continent1', 'State1', 'Continents', 'Country1'}
Both functions return a set so the ancestors and descendants are not ordered.
Upvotes: 0
Reputation: 2839
As I said in comments, you forget to append your descendant before looking deeper in your collection. This works:
def find_descendants(parent, collections):
descendants = []
for descendant in collections[parent]:
descendants.append(descendant)
if descendant in collections:
descendants = descendants + find_descendants(descendant, collections)
return descendants
For ancestors, just build an other collections
, say ancestors_collection
, that stores the reverse relation descendant/ancestor. The function to find ancestors should then be exactly the same as find_descendants, which you can rename accordingly.
EDIT:
Here a complete working code, I use
relative
to refer to ancestor or descendant:
import sys, re
with open("input.txt", "r") as my_in:
descendants={}
ancestors={}
for line in my_in:
parent, child=line.rstrip('\r\n').split('\t')
descendants.setdefault(parent, []).append(child)
ancestors.setdefault(child, []).append(parent)
def get_relatives(element, collection):
relatives = []
for relative in collection[element]:
relatives.append(relative)
if relative in collection:
relatives = relatives + get_relatives(relative, collection)
return relatives
# Get descendants of "Continent1":
lis=get_relatives("Continent1", descendants)
print (lis)
# shows ['Country1', 'State1', 'City1', 'Area1', 'Area2', 'City1.1']
lis=get_relatives("City1.1", ancestors)
print (lis)
# shows ['Earth', 'Continents', 'Continent1', 'Country1', 'State1']
Upvotes: 1