How to calculate index from python networkx by group(attribute)

Question

I have a table which contains 'from', 'to', 'date' columns.

I want to get any networkx index (e.g. degree, edges, nodes) by 'date'.

In reality there are a lot of dates, so it's impossible to calculate each index manually.

Is there any way to calculate degree() or edges() based on 'date'?

Thank you for reading.

Example code is as below.

import networkx as nx
import pandas as pd

df = pd.DataFrame({'from' : ['1','2','1','3'],
                   'to' : ['3','3','2','2'], 
                   'date' : ['20200501','20200501','20200502','20200502']})

G = nx.from_pandas_edgelist(df, source = 'from', target = 'to',
                            create_using=nx.DiGraph(), edge_attr = 'date')

# It's easy to calculate any index such as 'degree','node','edge'.

G.nodes()
G.degree()
G.edges()

# However, it's not easy to calculate an index based on 'date' column.

yatu · Accepted Answer

To inspect the edges that carry a certain date as an attribute, iterate over the edges with data=True and keep the ones that match. Then generate a new graph induced by those edges using Graph.edge_subgraph:

edges_from_date_x = [] 
some_date = '20200502'
for *edge, attr in G.edges(data=True):
    if attr['date'] == some_date:
        edges_from_date_x.append((*edge,))

print(edges_from_date_x)
# [('1', '2'), ('3', '2')]

Or if you prefer list-comps you could do as suggested by @AKX:

edges_from_date_x = [(*edge,) for *edge, attr in G.edges(data=True)
                     if attr['date'] == some_date]
# [('1', '2'), ('3', '2')]

Now generate the induced subgraph:

# induced subgraph
G_induced = G.edge_subgraph(edges_from_date_x)
# edgelist from the induced subgraph
G_induced.edges(data=True)
# OutEdgeDataView([('1', '2', {'date': '20200502'}), ('3', '2', {'date': '20200502'})])
# same with the nodes
G_induced.nodes()
# NodeView(('1', '3', '2'))
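
Since the question mentions many dates, the same idea can be repeated for every date instead of a single some_date. The following is only a sketch under the assumption that each edge carries exactly one 'date' attribute, as in the example data; the names edges_by_date and stats_by_date are made up for illustration and are not part of networkx.

```python
from collections import defaultdict

import networkx as nx
import pandas as pd

df = pd.DataFrame({'from': ['1', '2', '1', '3'],
                   'to': ['3', '3', '2', '2'],
                   'date': ['20200501', '20200501', '20200502', '20200502']})

G = nx.from_pandas_edgelist(df, source='from', target='to',
                            create_using=nx.DiGraph(), edge_attr='date')

# Group the edges by their 'date' attribute in a single pass.
# G.edges(data='date') yields (source, target, date) triples.
edges_by_date = defaultdict(list)  # hypothetical helper name
for u, v, date in G.edges(data='date'):
    edges_by_date[date].append((u, v))

# Build one induced subgraph per date and collect a few indices from it.
stats_by_date = {}  # hypothetical helper name
for date, edges in edges_by_date.items():
    sub = G.edge_subgraph(edges)
    stats_by_date[date] = {
        'nodes': sorted(sub.nodes()),
        'edges': sorted(sub.edges()),
        'degree': dict(sub.degree()),
    }

for date, stats in stats_by_date.items():
    print(date, stats)
```

Any other method that works on a graph (in_degree, out_degree, number_of_nodes, and so on) could be called on sub in the same loop. Note that edge_subgraph returns a read-only view of G; call sub.copy() if an independent graph per date is needed.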
