Reputation: 13
I am new to NetworkX and I have a problem that, I think, might be quite general: how can I take a directed network, convert it to an undirected network, and in this process record some information about the edges in the original directed network?
Specifically, I have a DiGraph in NetworkX that records links from id_from to id_to. Attributes for each edge are the month of the link and a weight.
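I would like to convert this directed graph to an undirected graph where I record as edge attributes the total weight across both directions, the first and last month in which a link occurred, the number of distinct months, and whether the link is bidirectional.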
Here is an example of the pandas dataframe that I start with:
In [12]: df
Out[12]:
id_from id_to total month
0 a b 100.0 2014-01-01
1 b a 10.0 2014-02-01
2 a c 15.0 2014-01-01
3 c d 7.0 2015-06-01
4 d c 500.0 2016-03-01
I read this as a DiGraph:
In [13]: G = nx.from_pandas_dataframe(df, 'id_from', 'id_to', edge_attr = True, create_using = nx.DiGraph())
In [14]: print(G.edges(data = True))
Out[14]: [(a, b, {'id_from': a, 'id_to': b, 'total': 100.0, 'month': Timestamp('2014-01-01 00:00:00')}),
 (b, a, {'id_from': b, 'id_to': a, 'total': 10.0, 'month': Timestamp('2014-02-01 00:00:00')}),
 (a, c, {'id_from': a, 'id_to': c, 'total': 15.0, 'month': Timestamp('2014-01-01 00:00:00')}),
 (c, d, {'id_from': c, 'id_to': d, 'total': 7.0, 'month': Timestamp('2015-06-01 00:00:00')}),
 (d, c, {'id_from': d, 'id_to': c, 'total': 500.0, 'month': Timestamp('2016-03-01 00:00:00')})]
And then I would ultimately like to get back a graph, which I can then convert back into a pandas dataframe at some point, that looks like:
id_one id_two total first_month last_month nr_months bidirect
0 a b 110.0 2014-01-01 2014-02-01 2.0 Yes
1 a c 15.0 2014-01-01 2014-01-01 1.0 No
2 c d 507.0 2015-06-01 2016-03-01 2.0 Yes
Can anyone help me with this?
I can't seem to find any questions that are similar, but please correct me if I am wrong. Any help is much appreciated.
Upvotes: 1
Views: 2371
Reputation: 1845
A possible approach is to use G.to_undirected() to transform the directed graph into an undirected graph, then iterate over the edges to update the desired attributes of each edge, and finally convert the graph to a dataframe:
import pandas as pd
import networkx as nx
import datetime

data = {
    'id_from': ['a', 'b', 'a', 'c', 'd'],
    'id_to': ['b', 'a', 'c', 'd', 'c'],
    'total': [100.0, 10.0, 15.0, 7.0, 500.0],
    'month': [datetime.datetime(2014, 1, 1), datetime.datetime(2014, 2, 1), datetime.datetime(2014, 1, 1), datetime.datetime(2015, 6, 1), datetime.datetime(2016, 3, 1)],
}
df = pd.DataFrame(data)

G = nx.from_pandas_edgelist(df, 'id_from', 'id_to', edge_attr=True, create_using=nx.DiGraph())
undirected = G.to_undirected()

for edge in undirected.edges(data=True):
    # Rows of the original dataframe for each direction of this edge
    # (.loc replaces .ix, which was removed in pandas 1.0).
    direction_1 = df.loc[(df['id_from'] == edge[0]) & (df['id_to'] == edge[1])]
    direction_2 = df.loc[(df['id_from'] == edge[1]) & (df['id_to'] == edge[0])]
    edges = pd.concat([direction_1, direction_2])
    # edge[2] is the attribute dictionary of the undirected edge.
    edge[2]['bidirect'] = 'Yes' if not direction_1.empty and not direction_2.empty else 'No'
    edge[2]['total'] = edges['total'].sum()
    edge[2]['first_month'] = edges['month'].min()
    edge[2]['last_month'] = edges['month'].max()
    edge[2]['nr_months'] = edges['month'].nunique()
    # Drop the single-direction month copied over by to_undirected().
    del edge[2]['month']

print(nx.to_pandas_edgelist(undirected))
The result is:
bidirect first_month last_month nr_months source target total
0 No 2014-01-01 2014-01-01 1 a c 15.0
1 Yes 2014-01-01 2014-02-01 2 a b 110.0
2 Yes 2015-06-01 2016-03-01 2 c d 507.0
When iterating over undirected.edges(data=True), each edge is a tuple whose first two elements are the edge's nodes and whose last element (edge[2]) is a dictionary of the edge's attributes. Hence, we can simply update this dictionary according to the desired logic.
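Because the attribute dictionary can be updated in place, the same aggregation can also be done without going back to the dataframe for every edge. The following is only a sketch under some assumptions: the helper name collapse_to_undirected and the temporary 'months' attribute are made up for illustration, and the directed graph is built by hand here so that the example does not depend on pandas.

```python
import datetime

import networkx as nx

# Same directed edges as in the example data, added by hand.
G = nx.DiGraph()
G.add_edge('a', 'b', total=100.0, month=datetime.datetime(2014, 1, 1))
G.add_edge('b', 'a', total=10.0, month=datetime.datetime(2014, 2, 1))
G.add_edge('a', 'c', total=15.0, month=datetime.datetime(2014, 1, 1))
G.add_edge('c', 'd', total=7.0, month=datetime.datetime(2015, 6, 1))
G.add_edge('d', 'c', total=500.0, month=datetime.datetime(2016, 3, 1))


def collapse_to_undirected(digraph):
    """Hypothetical helper: merge opposite directed edges into one undirected edge."""
    undirected = nx.Graph()
    # Keep all nodes, including any that have no edges.
    undirected.add_nodes_from(digraph.nodes(data=True))
    for u, v, attrs in digraph.edges(data=True):
        if undirected.has_edge(u, v):
            # The opposite direction was seen already: update its dict in place.
            merged = undirected[u][v]
            merged['total'] += attrs['total']
            merged['months'].add(attrs['month'])
            merged['bidirect'] = 'Yes'
        else:
            undirected.add_edge(u, v, total=attrs['total'],
                                months={attrs['month']}, bidirect='No')
    # Replace the temporary set of months with the summary attributes.
    for u, v, merged in undirected.edges(data=True):
        months = merged.pop('months')
        merged['first_month'] = min(months)
        merged['last_month'] = max(months)
        merged['nr_months'] = len(months)
    return undirected


H = collapse_to_undirected(G)
print(sorted(H.edges(data='total')))
```

For the sample data, the a-b edge ends up with a total of 110.0 and bidirect 'Yes', while a-c keeps 15.0 and bidirect 'No'. The result can be passed to nx.to_pandas_edgelist(H) in the same way as before. Each directed edge is visited once, which may matter for larger graphs where filtering the dataframe per edge becomes slow.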
Upvotes: 0