Reputation: 19
I have the following data frame for a critical path problem:
import pandas as pd

Activity = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Predecessor = [None, None, None, 'A', 'C', 'A', ['B', 'D', 'E']]
Durations = [2, 6, 4, 3, 5, 4, 2]
df = pd.DataFrame(zip(Activity, Predecessor, Durations),
                  columns=['Activity', 'Predecessor', 'Durations'])
print(df)
  Activity Predecessor  Durations
0        A        None          2
1        B        None          6
2        C        None          4
3        D           A          3
4        E           C          5
5        F           A          4
6        G   [B, D, E]          2
The goal is to create a new column with the total durations. For example, the predecessor of activity 'D' is 'A', and 'D' is in turn one of the predecessors of 'G'. The total duration of the path from 'A' through 'D' to 'G' should therefore be 7 (i.e. 2+3+2=7). Following this logic, I need the durations of all possible paths. For illustration, see the following picture of the network flow:
In addition, it would be great to visualize the network flow using Python libraries. I hope someone can suggest some ideas.
Upvotes: -1
Views: 250
Reputation: 260620
The output you expect is not fully clear, but you can use networkx to create and handle your graph.
Here is the graph:
import networkx as nx

# duration of each activity, indexed by activity name
durations = df.set_index('Activity')['Durations']

# create the graph: one edge per (predecessor, activity) pair;
# activities without a predecessor are attached to a dummy 'Root' node
G = nx.from_pandas_edgelist(df.explode('Predecessor').fillna('Root'),
                            create_using=nx.DiGraph,
                            source='Predecessor', target='Activity')

# sum the durations of the direct predecessors of each node
df['total'] = [sum(durations.get(n) for n in G.predecessors(node) if n != 'Root')
               for node in df['Activity']]

# add the activity's own duration
df['total'] += df['Durations']
output:
  Activity Predecessor  Durations  total
0        A        None          2      2
1        B        None          6      6
2        C        None          4      4
3        D           A          3      5
4        E           C          5      9
5        F           A          4      6
6        G   [B, D, E]          2     16
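Note that the total above only adds the durations of the direct predecessors. If what you want is the duration of every complete path (such as A, D, G = 2+3+2 = 7 from the question), here is a minimal sketch in plain Python. It assumes the activity network has no cycles and that activities without a predecessor are the starting points; the names `successors`, `walk` and `path_durations` are made up for this illustration and do not come from any library.

```python
# Plain-Python sketch: enumerate every start-to-end path and its total duration.
activities = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
predecessors = [None, None, None, 'A', 'C', 'A', ['B', 'D', 'E']]
durations = dict(zip(activities, [2, 6, 4, 3, 5, 4, 2]))

# Build successor lists from the predecessor column (None marks a start activity).
successors = {a: [] for a in activities}
starts = []
for act, pred in zip(activities, predecessors):
    if pred is None:
        starts.append(act)
        continue
    # a predecessor entry is either a single name or a list of names
    for p in ([pred] if isinstance(pred, str) else pred):
        successors[p].append(act)


def walk(node, path):
    """Yield every path from `node` to an activity that has no successors."""
    path = path + [node]
    if not successors[node]:
        yield path
    for nxt in successors[node]:
        yield from walk(nxt, path)


# Map each path (as a readable string) to the sum of its activity durations.
path_durations = {
    '->'.join(p): sum(durations[a] for a in p)
    for s in starts
    for p in walk(s, [])
}

for name, total in path_durations.items():
    print(name, total)

# The critical path is the path with the largest total duration.
critical_path = max(path_durations, key=path_durations.get)
print('critical path:', critical_path, path_durations[critical_path])
```

With the data from the question this gives four paths: A->D->G (7), A->F (6), B->G (8) and C->E->G (11), so C->E->G would be the critical path. If you prefer to stay with the networkx graph, `nx.all_simple_paths` can probably be used to list the same paths from 'Root' to each end node.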
Upvotes: 1