Reputation: 41
I am trying to write a function that takes a graph and returns a DataFrame whose first column is the list of nodes with the highest centrality measure and whose second column is the value of that highest centrality measure.
Attached below is my code and I could not figure out how to complete it. How can I find the highest centrality without having another function?
def summary(G):
    df = pd.DataFrame()
    dc=nx.degree_centrality(G)
    cc=nx.closeness_centrality(G)
    bc=nx.closeness_centrality(G)
    df['Nodes with the highest centrality measure']= #addcodehere
    df['Value of the highest centrality measure']= #addcodehere
    return df.set_index(['dc','cc','bc'])
Upvotes: 4
Views: 1574
Reputation: 10020
You can do it this way:
# Imports and graph creation (you don't need them in your function)
import networkx as nx
import pandas as pd
G = nx.fast_gnp_random_graph(20, 0.1)
Create the centrality dict:
cc = nx.closeness_centrality(G)
cc looks like this:
{0: 0.28692699490662144, 1: 0.26953748006379585, 2: 0.32943469785575047, 3: 0.28692699490662144, 4: 0.30671506352087113, 5: 0.26953748006379585, ...
Then use from_dict to create the dataframe:
df = pd.DataFrame.from_dict({
    'node': list(cc.keys()),
    'centrality': list(cc.values())
})
df looks like this:
   centrality  node
0    0.286927     0
1    0.269537     1
2    0.329435     2
3    0.286927     3
4    0.306715     4
5    0.269537     5
...
Then sort it by centrality in descending order:
df = df.sort_values('centrality', ascending=False)
Now df looks like this:
    centrality  node
12    0.404306    12
7     0.386728     7
2     0.329435     2
4     0.306715     4
0     0.286927     0
...
And return the result. The full code is:
def summary(G):
    cc = nx.closeness_centrality(G)
    df = pd.DataFrame.from_dict({
        'node': list(cc.keys()),
        'centrality': list(cc.values())
    })
    return df.sort_values('centrality', ascending=False)
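If you want exactly the two columns from the question, with one row per centrality measure, something like the following might work. This is only a sketch under some assumptions: the name summary_top and the 'measure' index are made up for illustration, I am guessing that bc in the question was meant to be betweenness centrality, and the graph is assumed to have at least one node. Ties are kept, so the first column holds a list of nodes.

```python
import networkx as nx
import pandas as pd

def summary_top(G):
    # Hypothetical helper: one row per centrality measure.
    # Assumes G has at least one node (max() fails on an empty dict).
    measures = {
        'degree': nx.degree_centrality(G),
        'closeness': nx.closeness_centrality(G),
        # Assumption: the question's `bc` was meant to be betweenness.
        'betweenness': nx.betweenness_centrality(G),
    }
    rows = []
    for name, scores in measures.items():
        top = max(scores.values())
        # Keep every node that reaches the maximum, so ties are not lost.
        nodes = [n for n, v in scores.items() if v == top]
        rows.append({
            'measure': name,
            'Nodes with the highest centrality measure': nodes,
            'Value of the highest centrality measure': top,
        })
    return pd.DataFrame(rows).set_index('measure')
```

Calling summary_top(G) on the random graph above gives a three-row DataFrame indexed by measure name. Comparing floats with == is fine here because the maximum is taken from the same dict, but you may prefer a tolerance if the values come from elsewhere.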
Upvotes: 4