Reputation: 235
I am new to python and algorithms. I have been trying to implement a topological sorting algorithm for a while but can't seem to create a structure that works. The functions I have made run on a graph represented in an adj list.
When I have a DFS, the nodes are discovered top down, and nodes that have been already visited and not processed again:
def DFS(location, graph, visited = None):
if visited == None:
visited = [False for i in range(len(graph))]
if visited[location] == True:
return
visited[location] = True
node_visited.append(location)
for node in graph[location]:
DFS(node, graph, visited)
return visited
When I am trying to build a topological sort algorithm, I create a new function which essentially checks the "availability" of that node to be added to the sorted list (ie: whether its neighbouring nodes have been visited already)
def availability(graph, node):
count = 0
for neighbour in graph[node]:
if neighbour in available_nodes:
count += 1
if count != 0:
return False
return True
However, my issue is that once I have visited the node path to get to the bottom of the graph, the DFS does not allow me to revisit that those nodes. Hence, any updates I make once I discover the end of the path can not be processed.
My approach may be totally off, but I am wondering if someone could help improve my implementation design, or explain how the implementation is commonly done. Thanks in advance.
Upvotes: 0
Views: 4960
Reputation: 5
class Graph:
def __init__(self):
self.edges = {}
def addNode(self, node):
self.edges[node] = []
def addEdge(self, node1, node2):
self.edges[node1] += [node2]
def getSub(self, node):
return self.edges[node]
def DFSrecu(self, start, path):
for node in self.getSub(start):
if node not in path:
path = self.DFSrecu(node, path)
if start not in path:
path += [start]
return path
def topological_sort(self, start):
topo_ordering_list = self.DFSrecu(start, [])
# this for loop it will help you to visit all nodes in the graph if you chose arbitrary node
# because you need to check if all nodes in the graph is visited and sort them
for node in g.edges:
if node not in topo_ordering_list:
topo_ordering_list = g.DFSrecu(node, topo_ordering_list)
return topo_ordering_list
if __name__ == "__main__":
g = Graph()
for node in ['S', 'B', 'A', 'C', 'G', 'I', "L", 'D', 'H']:
g.addNode(node)
g.addEdge("S", "A")
g.addEdge("S", "B")
g.addEdge("B", "D")
g.addEdge("D", "H")
g.addEdge("D", "G")
g.addEdge("H", "I")
g.addEdge("I", "L")
g.addEdge("G", "I")
last_path1 = g.topological_sort("D")
last_path2 = g.topological_sort("S")
print("Start From D: ",last_path1)
print("start From S: ",last_path2)
Output:
Start From D: ['L', 'I', 'H', 'G', 'D', 'A', 'B', 'S', 'C']
start From S: ['A', 'L', 'I', 'H', 'G', 'D', 'B', 'S', 'C']
you can see here 'C' is included in topological sorted list even it's not connect to any other node but 'C' in the graph and you need to visited her that's way you need for loop in topological_sort() function
Upvotes: 0
Reputation: 59144
You don't need that availability check to do a topological sort with DFS.
DFS itself ensures that you don't leave a node until its children have already been processed, so if you add each node to a list when DFS finishes with it, they will be added in (reverse) topological order.
Don't forget to do the whole graph, though, like this:
def toposort(graph):
visited = [False for i in range(len(graph))]
result = []
def DFS(node):
if visited[node]:
return
visited[node] = True
for adj in graph[node]:
DFS(adj)
result.append(node)
for i in range(len(graph)):
DFS(i)
return result
Upvotes: 2