Reputation: 461
I have a DAG that I converted to a pandas DataFrame.
The DataFrame is:
df=pd.DataFrame({'dad':[1, 2, 3, 4,5, "T1", "T2"],
'children':["T1","T1","T2","T2",6,"T3","T3"]})
print (df)
I want to get a list of all the nodes/edges that are connected in my DAG (graph), so that the result looks like this:
df=pd.DataFrame({'dad':[1, 2, 3, 4,5, "T1", "T2","T3"],
'children':["T1","T1","T2","T2",6,"T3","T3","X"],
'chain':[0,0,0,0,0,[1,2],[3,4],[1,2,3,4,"T1","T2"]] })
I would like to know how the nodes are connected along the whole chain, as in the new "chain" column. It can be a new column as shown here, and the order is not important.
I use pandas and networkx, but I would be happy to learn about another DAG library like networkx for Python.
The graph looks like it has 2 trees inside.
Upvotes: 2
Views: 728
Reputation: 153550
You can use networkx, as @QuangHoang suggests, like this:
import pandas as pd
import networkx as nx
df=pd.DataFrame({'dad':[1, 2, 3, 4,5, "T1", "T2"],
'children':["T1","T1","T2","T2",6,"T3","T3"]})
G = nx.from_pandas_edgelist(df, 'dad','children', create_using=nx.DiGraph())
df['chain'] = df['dad'].transform(lambda x: list(G.predecessors(x)))
df
Output:
dad children chain
0 1 T1 []
1 2 T1 []
2 3 T2 []
3 4 T2 []
4 5 6 []
5 T1 T3 [1, 2]
6 T2 T3 [3, 4]
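Note that `G.predecessors` returns only the direct parents of a node. If the chain should contain every upstream node instead, `nx.ancestors` may be closer to what the question asks for. The following is a minimal sketch under that assumption; the helper name `upstream` and the `key=str` sort (used only to get a stable order for mixed int/str labels) are my own choices, not part of the original code.

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({'dad': [1, 2, 3, 4, 5, "T1", "T2"],
                   'children': ["T1", "T1", "T2", "T2", 6, "T3", "T3"]})
G = nx.from_pandas_edgelist(df, 'dad', 'children', create_using=nx.DiGraph())

def upstream(node):
    # nx.ancestors returns the set of all nodes that have a path to `node`;
    # sorting by str gives a stable order for mixed int/str labels
    return sorted(nx.ancestors(G, node), key=str)

# Full upstream chain for every node in the 'dad' column
df['chain'] = df['dad'].apply(upstream)
print(df)

# 'T3' never appears in the 'dad' column, so query it directly
print(upstream("T3"))
```

For the nodes in the `dad` column of this particular graph the result matches `G.predecessors`, because none of them is more than one step below a root; the difference only shows for deeper nodes such as `T3`.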
I think you need all the weakly connected components of the DiGraph. Here is a way to add those components as extra rows with their chains:
import pandas as pd
import networkx as nx
df=pd.DataFrame({'dad':[1, 2, 3, 4,5, "T1", "T2"],
'children':["T1","T1","T2","T2",6,"T3","T3"]})
G = nx.from_pandas_edgelist(df, 'dad','children', create_using=nx.DiGraph())
df['chain'] = df['dad'].transform(lambda x: list(G.predecessors(x)))
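# Each weakly connected component is a set of nodes (sets are unordered)
w_list = list(nx.weakly_connected_components(G))
# Use a sink (a node without outgoing edges) as the component's 'dad';
# list(n)[-1] would depend on the arbitrary ordering of the set
df_comp = pd.DataFrame({'dad': [next(v for v in n if G.out_degree(v) == 0) for n in w_list],
'children':['X' for _ in w_list],
'chain': [list(x) for x in w_list]})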
df_out = pd.concat([df, df_comp])
df_out
Output:
dad children chain
0 1 T1 []
1 2 T1 []
2 3 T2 []
3 4 T2 []
4 5 6 []
5 T1 T3 [1, 2]
6 T2 T3 [3, 4]
0 T3 X [1, 2, 3, 4, T1, T2, T3]
1 6 X [5, 6]
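The component rows above include the end node itself in `chain`, while the expected output in the question lists only the nodes upstream of `T3`. One possible variation, sketched here as an assumption about what is wanted, is to add one row per sink (a node with no outgoing edges) and fill `chain` with `nx.ancestors` of that sink. The names `sinks` and `df_sinks` are hypothetical.

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({'dad': [1, 2, 3, 4, 5, "T1", "T2"],
                   'children': ["T1", "T1", "T2", "T2", 6, "T3", "T3"]})
G = nx.from_pandas_edgelist(df, 'dad', 'children', create_using=nx.DiGraph())

# Direct parents for the existing rows, as in the answer above
df['chain'] = df['dad'].apply(lambda n: sorted(G.predecessors(n), key=str))

# Sinks are the nodes with no outgoing edge (the end of each chain)
sinks = [n for n in G.nodes if G.out_degree(n) == 0]

# One extra row per sink, listing every node upstream of it
df_sinks = pd.DataFrame({
    'dad': sinks,
    'children': 'X',
    'chain': [sorted(nx.ancestors(G, n), key=str) for n in sinks],
})

df_out = pd.concat([df, df_sinks], ignore_index=True)
print(df_out)
```

This also produces a row for node `6` (with `5` as its only ancestor), which the expected output in the question does not show; drop it if only the `T3` chain is of interest.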
Upvotes: 4