Networkx: Calculating and storing shortest paths on a graph to a Pandas Data frame

Question

I have a pandas DataFrame as shown below. It has many more columns, but they are not relevant to this task. The column id holds the sentence ID, the columns e1 and e2 contain entities (= words) of the sentence, and the column r holds their relationship.

id     e1        e2          r
10     a-5       b-17        A 
10     b-17      a-5         N
17     c-1       a-23        N
17     a-23      c-1         N
17     d-30      g-2         N
17     g-20      d-30        B

I also created a graph for each sentence. The graph is created from a list of edges that looks somewhat like this

[('wordB-5', 'wordA-1'), ('wordC-8', 'wordA-1'), ...]

All of those edge lists are stored in one list (of lists). Each element of that list contains all the edges of one sentence, meaning list[0] has the edges of sentence 0 and so on.

Now I want to perform operations like these:

graph = nx.Graph(graph_edges[i])
shortest_path = nx.shortest_path(graph, source="e1", target="e2")
result_length = len(shortest_path)
result_path = shortest_path

For each row in the DataFrame, I'd like to calculate the shortest path (from the entity in e1 to the entity in e2) and save all of the results in new columns of the DataFrame, but I have no idea how to do that.

I tried using constructions such as these

e1 = DF["e1"].tolist()
e2 = DF["e2"].tolist()
for id in DF["id"]:
    graph = nx.Graph(graph_edges[id])
    shortest_path = nx.shortest_path(graph,source=e1, target=e2)
result_length = len(shortest_path)
result_path = shortest_path

to create the data but it says the target is not in the graph.

The desired result is a new DataFrame like this:

id     e1        e2          r     length     path
10     a-5       b-17        A       4         ..
10     b-17      a-5         N       4         ..
17     c-1       a-23        N       3         ..
17     a-23      c-1         N       3         ..
17     d-30      g-2         N       7         ..
17     g-20      d-30        B       7         ..

Mi. · Accepted Answer

For anyone who is interested in the solution (thanks to Ram Narasimhan):

import networkx as nx

pathlist, len_list = [], []
sources, targets = DF["e1"].tolist(), DF["e2"].tolist()
sentence_ids = DF["id"].tolist()

for sentence_id, s, t in zip(sentence_ids, sources, targets):
    # Build the graph for this row's sentence from its edge list
    graph = nx.Graph(graph_edges[sentence_id])
    try:
        path = nx.shortest_path(graph, source=s, target=t)
        # Counts the edges on the path, i.e. len(path) - 1
        length = nx.shortest_path_length(graph, source=s, target=t)
    except nx.NetworkXNoPath:
        # Both nodes are in the graph but not connected
        path = "No Path"
        length = "No Pathlength"
    pathlist.append(path)
    len_list.append(length)

# Add these lists as new columns in the DF
DF['length'] = len_list
DF['path'] = pathlist
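
The "target is not in the graph" error from the question is a separate case from a missing path: as far as I know, NetworkX 2.x and later raise nx.NodeNotFound when the source or target node does not exist in the graph at all. Below is a minimal sketch of a variant that handles both cases and builds each sentence graph only once. The sample edges, the dict keyed by sentence ID, and the helper name path_for_row are assumptions made for illustration and are not part of the original data.

```python
import networkx as nx
import pandas as pd

# Hypothetical sample data: edge lists keyed by sentence ID (an assumption;
# the question stores them in a list indexed by sentence).
graph_edges = {
    10: [("a-5", "x-1"), ("x-1", "b-17")],
    17: [("c-1", "a-23"), ("d-30", "y-2")],
}
DF = pd.DataFrame({
    "id": [10, 17, 17],
    "e1": ["a-5", "c-1", "d-30"],
    "e2": ["b-17", "a-23", "g-2"],  # "g-2" is deliberately missing from graph 17
})

# Build every sentence graph once instead of once per row.
graphs = {sid: nx.Graph(edges) for sid, edges in graph_edges.items()}

def path_for_row(row):
    """Return (length, path) for one row, or (None, None) if unavailable."""
    graph = graphs[row["id"]]
    try:
        path = nx.shortest_path(graph, source=row["e1"], target=row["e2"])
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        # No connection, or one of the entities is not a node in this graph.
        return None, None
    return len(path) - 1, path  # length counted in edges

results = [path_for_row(row) for _, row in DF.iterrows()]
DF["length"] = [length for length, _ in results]
DF["path"] = [path for _, path in results]
```

Returning None instead of a string such as "No Path" keeps the length column numeric (pandas stores the gaps as NaN), which makes later filtering or aggregation easier.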


Answers (2)

Step 1: Build the graph and add edges, one by one

Step 2: Create a data frame of desired distances

Step 3: Calculate Shortest path and length, and store in the data frame
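
Only the step headings of this answer are preserved here, so the following is a rough sketch of what the three steps might look like, not the answerer's original code. The edges, the node pairs, and the name pairs are made up for illustration.

```python
import networkx as nx
import pandas as pd

# Step 1: build the graph and add edges one by one (hypothetical edges).
G = nx.Graph()
for u, v in [("wordB-5", "wordA-1"), ("wordC-8", "wordA-1"), ("wordD-2", "wordC-8")]:
    G.add_edge(u, v)

# Step 2: create a data frame of the node pairs whose distance is wanted.
pairs = pd.DataFrame({
    "e1": ["wordB-5", "wordD-2"],
    "e2": ["wordC-8", "wordB-5"],
})

# Step 3: calculate the shortest path and its length, and store both.
pairs["path"] = [
    nx.shortest_path(G, source=s, target=t)
    for s, t in zip(pairs["e1"], pairs["e2"])
]
pairs["length"] = [len(p) - 1 for p in pairs["path"]]  # length in edges
```

Here every pair is connected, so no exception handling is shown; with real data the try/except from the accepted answer would still be needed.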
