I have a large graph generated with the NetworkX package.
Here is a small sample:
import networkx as nx
import pandas as pd
G = nx.path_graph(4)
nx.add_path(G, [10, 11, 12])
I'm trying to create a DataFrame with the columns Node, degree, ComponentID and Components.
I created the degree column using:
degrees = list(nx.degree(G))
data = pd.DataFrame([list(d) for d in degrees], columns=['Node', 'degree']).sort_values('degree', ascending=False)
I extracted the connected components using:
Gcc = sorted(nx.connected_components(G), key=len, reverse=True)
Gcc
[{0, 1, 2, 3}, {10, 11, 12}]
I'm not sure how to add the ComponentID and Components columns to data.
Required output:
   Node  degree  ComponentID    Components
1     1       2            1  {0, 1, 2, 3}
2     2       2            1  {0, 1, 2, 3}
5    11       2            2  {10, 11, 12}
0     0       1            1  {0, 1, 2, 3}
3     3       1            1  {0, 1, 2, 3}
4    10       1            2  {10, 11, 12}
6    12       1            2  {10, 11, 12}
How can I generate the component IDs and add them to the DataFrame of nodes and degrees?
Upvotes: 3
Views: 641
Reputation: 71707
Create (Node, ComponentID, Components) triplets by enumerating the list of connected components, then build a new DataFrame from these triplets and merge it with the existing DataFrame on Node:
df = pd.DataFrame([(n, i, c) for i, c in enumerate(Gcc, 1) for n in c],
                  columns=['Node', 'ComponentID', 'Components'])
data = data.merge(df, on='Node')
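For reference, the merge approach can be sketched end to end on the sample graph from the question. This is only a minimal sketch: the name merged is introduced here for illustration and is not part of the original snippets. One thing it makes visible is that merge returns a fresh 0..n-1 index, so the row labels will differ from the ones in the question's required output even though the rows themselves are the same.

```python
import networkx as nx
import pandas as pd

# Sample graph from the question: two disjoint paths.
G = nx.path_graph(4)
nx.add_path(G, [10, 11, 12])

# Node/degree table, highest degree first.
data = (pd.DataFrame(list(G.degree()), columns=['Node', 'degree'])
          .sort_values('degree', ascending=False))

# Connected components, largest first, numbered from 1.
Gcc = sorted(nx.connected_components(G), key=len, reverse=True)
df = pd.DataFrame([(n, i, c) for i, c in enumerate(Gcc, 1) for n in c],
                  columns=['Node', 'ComponentID', 'Components'])

# 'merged' is a hypothetical name; the answer reassigns 'data' instead.
merged = data.merge(df, on='Node')
print(merged)
```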
Alternatively, you can use map instead of merge to create the ComponentID and Components columns individually:
d = dict(enumerate(Gcc, 1))
data['ComponentID'] = data['Node'].map({n:i for i,c in d.items() for n in c})
data['Components'] = data['ComponentID'].map(d)
print(data)
   Node  degree  ComponentID    Components
1     1       2            1  {0, 1, 2, 3}
2     2       2            1  {0, 1, 2, 3}
5    11       2            2  {10, 11, 12}
0     0       1            1  {0, 1, 2, 3}
3     3       1            1  {0, 1, 2, 3}
4    10       1            2  {10, 11, 12}
6    12       1            2  {10, 11, 12}
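If this is needed for more than one graph, the map approach could be wrapped in a small helper. The sketch below assumes an undirected graph (nx.connected_components does not accept directed graphs); the function name component_table and the variable names inside it are hypothetical and chosen only for illustration. Because map works on the existing rows, the original index labels are kept, which matches the output shown above.

```python
import networkx as nx
import pandas as pd


def component_table(G):
    """Return a DataFrame with Node, degree, ComponentID and Components columns.

    Hypothetical helper; assumes G is an undirected NetworkX graph.
    """
    data = (pd.DataFrame(list(G.degree()), columns=['Node', 'degree'])
              .sort_values('degree', ascending=False))
    # Number the components from 1, largest first.
    components = dict(enumerate(
        sorted(nx.connected_components(G), key=len, reverse=True), 1))
    # Lookup from each node to the ID of the component containing it.
    node_to_id = {n: i for i, c in components.items() for n in c}
    data['ComponentID'] = data['Node'].map(node_to_id)
    data['Components'] = data['ComponentID'].map(components)
    return data


# Usage on the sample graph from the question.
G = nx.path_graph(4)
nx.add_path(G, [10, 11, 12])
table = component_table(G)
print(table)
```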