user1427661
user1427661

Reputation: 11774

Creating Adjacency Lists from Dicts in Python

I'm used to using dicts to represent graphs in python, but I'm running into some serious performance issues with large graphs and complex calculations, so I think I should cross over to using adjacency matrixes to bypass the overhead of hash tables. My question is, if I have a graph of the form g: {node: {vertex: weight . . . } . . . }, what would be the most efficient way to convert that into a list-based adjacency matrix?

Upvotes: 3

Views: 9856

Answers (2)

Aditya P
Aditya P

Reputation: 1074

Well to implement in a adjacency list, you can create two classes, one for storing the information about the vertex's.

# Vertex, which will represent each vertex in the graph.Each Vertex uses a dictionary
# to keep track of the vertices to which it is connected, and the weight of each edge. 
class Vertex:

# Initialze a object of this class
# we use double underscore 
def __init__(self, key):
    # we identify the vertex with its key
    self.id = key
    # this stores the info about the various connections any object 
    # (vertex) of this class has using a dictionary which is called connectedTo.
    # initially its not connected to any other node so,
    self.connectedTo={}

# Add the information about connection between vertexes into the dictionary connectedTo
def addNeighbor(self,neighbor,weight=0):
    # neighbor is another vertex we update the connectedTo dictionary ( Vertex:weight )
    # with the information of this new Edge, the key is the vertex and
    # the edge's weight is its value. This is the new element in the dictionary
    self.connectedTo[neighbor] = weight

# Return a string containing a nicely printable representation of an object.
def __str__(self):
    return str(self.id) + ' connectedTo: ' + str([x.id for x in self.connectedTo])

# Return the vertex's self is connected to in a List
def getConnections(self):
    return self.connectedTo.keys()

# Return the id with which we identify the vertex, its name you could say
def getId(self):
    return self.id

# Return the value (weight) of the edge (or arc) between self and nbr (two vertices)
def getWeight(self,nbr):
    return self.connectedTo[nbr]

As you can see from the comments, each vertex stores a 'key' which is used to identify it, and it had a dictionary 'connectedTo' which holds the key-weight pairs of all connections from this vertex. key of the connected vertex and weight of the edge.

Now we can store a collection of such vertex's using the Graph class which can be implemented like this,

# The Graph class contains a dictionary that maps vertex keys to vertex objects (vertlist) and a    count of the number of vertices in the graph
class Graph:

def __init__(self):
    self.vertList = {}
    self.numVertices = 0


# Returns a vertex which was added to the graph with given key
def addVertex(self,key):
    self.numVertices = self.numVertices + 1
    # create a vertex object
    newVertex = Vertex(key)
    # set its key
    self.vertList[key] = newVertex
    return newVertex

# Return the vertex object corresponding to the key - n
def getVertex(self,n):
    if n in self.vertList:
        return self.vertList[n]
    else:
        return None

# Returns boolean - checks if graph contains a vertex with key n
def __contains__(self,n):
    return n in self.vertList

# Add's an edge to the graph using addNeighbor method of Vertex
def addEdge(self,f,t,cost=0):
    # check if the 2 vertices involved in this edge exists inside
    # the graph if not they are added to the graph
    # nv is the Vertex object which is part of the graph
    # and has key of 'f' and 't' respectively, cost is the edge weight
    if f not in self.vertList:
        nv = self.addVertex(f)
    if t not in self.vertList:
        nv = self.addVertex(t)
    # self.vertList[f] gets the vertex with f as key, we call this Vertex
    # object's addNeighbor with both the weight and self.vertList[t] (the vertice with t as key) 
    self.vertList[f].addNeighbor(self.vertList[t], cost)

# Return the list of all key's corresponding to the vertex's in the graph
def getVertices(self):
    return self.vertList.keys()

# Returns an iterator object, which contains all the Vertex's
def __iter__(self):
    return iter(self.vertList.values())

here, we have the graph class which holds the number of vertex's in 'numVertices' and has the dictionary 'vertList' which has the key and Vertex (the class we just made) objects present in the graph. we can create a graph and set it up by calling

# Now lets make the graph
the_graph=Graph()

print "enter the number of nodes in the graph"
no_nodes=int(raw_input())

# setup the nodes
for i in range(no_nodes):
        print "enter the "+str(i+1)+" Node's key"
        the_graph.addVertex(raw_input())

print "enter the number of edges in the graph"
no_edges=int(raw_input())
print "enter the maximum weight possible for any of edges in the graph"
max_weight=int(raw_input())


# setup the edges
for i in range(no_edges):
    print "For the "+str(i+1)+" Edge, "
    print "of the 2 nodes involved in this edge \nenter the first Node's key"
    node1_key=raw_input()
    print "\nenter the second Node's key"
    node2_key=raw_input()
    print "\nenter the cost (or weight) of this edge (or arc) - an integer"
    cost=int(raw_input())
    # add the edge with this info
    the_graph.addEdge(node1_key,node2_key,cost)

If you want undirected graphs then add this line the_graph.addEdge(node2_key,node1_key,cost) Thus the connection will be stored not as a connected to b but a connected to b and b connected to a. Also mind the indentation for both class implementations above, it might be incorrect.

Upvotes: 0

J. Katzwinkel
J. Katzwinkel

Reputation: 1953

Probably not the most efficient, but a simple way to convert your format to an adjacency matrix on a list-basis could look like this:

g = {1:{2:.5, 3:.2}, 2:{4:.7}, 4:{5:.6, 3:.3}}
hubs = g.items() # list of nodes and outgoing vertices
size=max(map(lambda hub: max(hub[0], max(hub[1].keys())), hubs))+1 # matrix dimension is highest known node index + 1
matrix=[[None]*size for row in range(size)] # set up a matrix of the appropriate size

for node, vertices in hubs: # loop through every node in dictionary
    for vertice, weight in vertices.items(): # loop through vertices
        matrix[vertice][node] = weight # define adjacency of both nodes by assigning the vertice's weight

This works for directed graphs assuming that the nodes are represented simply by their indexes starting with zero. Here is a visualization and the resulting matrix for the graph processed in this sample:

     0    1    2    3    4    5
   ------------------------------
0 |                              
1 |                              
2 |      0.5                     
3 |      0.2            0.3      
4 |           0.7                
5 |                     0.6      

If your graph is in fact undirected, I could think of some chances to optimize. In case the dictionary containes every node as a key with all its vertices listed, like {1:{2:.2, 3:.3}, 2:{1:.2}, 3:{1:.3}}, you could sort the corresponding list before traversing and limit the inner loop:

hubs = sorted(g.items())
for node, vertices in hubs:
    for vertice, weight in reversed(sorted(vertices.items())):
        if vertice >= node: 
            matrix[vertice][node] = weight
            matrix[node][vertice] = weight
        else: # do only care about vertices that haven't been saved before,
            break # continue with next node when the current one won't introduce any more vertices

You can probably have this more efficient by using binary search. Since the resulting matrix will obviously be a mirror-symmetric one, you could also go further and only store one half of it. Easiest way to do this is maybe to flip it on the vertical axis:

# unlike the one before, this sample doesn't rely on the dictionary containing every vertice twice
matrix=[[None]*size for row in range(size)]  
for node, vertices in hubs:
    for vertice, weight in vertices.items():
        matrix[vertice][size-node-1] = weight

Because of one half of the matrix being cut off, not every lookup for the vertice between nodes (u,v) will work, so it has to be made sure that the index of the column is greater than the row's for the cell to look up:

u,v = sorted((u,v))
weight = matrix[v][u]

Good luck!

Upvotes: 3

Related Questions