Reputation: 25
I am working with a large network. My data is stored in a relational table as an edge list, which I import from a CSV file. The Source and Target column values are integers. The network graph is directed. I know that the average in-degree should be higher than the average out-degree, yet networkx returns the same value for both measures. Is there something wrong with my code?
Here is my code:
import networkx as nx
G = nx.from_pandas_edgelist(all, 'Source', 'Target', edge_attr=True, create_using=nx.DiGraph)
print(nx.info(G))
The DataFrame (all) looks like this:
Source | Target | project_name | link_name | project_timestamp | link_timestamp |
---|---|---|---|---|---|
3331 | 2321 | A | aaa | 2013/01/02 | 2012/11/25 |
3332 | 2323 | B | kla | 2013/01/03 | 2012/06/06 |
3332 | 9093 | B | dyr | 2013/01/03 | 2012/02/03 |
Upvotes: 1
Views: 557
Reputation: 799
There doesn't seem to be anything wrong with your code. In a directed graph, the average in-degree is always equal to the average out-degree (see the handshaking lemma on Wikipedia, which covers directed graphs in the "Definitions and Statement" section).
With the example that you've shared, you have five nodes and three edges:
3331 -> 2321
3332 -> 2323
3332 -> 9093
From that:
Node | in-degree | out-degree |
---|---|---|
2321 | 1 | 0 |
2323 | 1 | 0 |
3331 | 0 | 1 |
3332 | 0 | 2 |
9093 | 1 | 0 |
Average | 3/5 = 0.6 | 3/5 = 0.6 |
Every time an additional edge is added, the out-degree of the source node goes up by one, and the in-degree of the target node goes up by one. Therefore, the sum of the in-degrees and the sum of the out-degrees always remain equal to one another (both equal the number of edges). Because both sums are divided by the same number of nodes, the averages are always equal as well!
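To make the counting argument concrete, here is a minimal sketch in plain Python that tallies degrees edge by edge, using the three sample rows from the question. The variable names are my own and this is not how networkx computes it internally; it only illustrates why the two averages cannot differ.

```python
from collections import Counter

# Edges from the question's sample rows, as (source, target) pairs.
edges = [(3331, 2321), (3332, 2323), (3332, 9093)]

in_degree = Counter()
out_degree = Counter()
nodes = set()

for source, target in edges:
    out_degree[source] += 1  # each edge leaves exactly one node
    in_degree[target] += 1   # and enters exactly one node
    nodes.update((source, target))

# Both sums equal the number of edges, and both are divided by the same node count.
avg_in = sum(in_degree.values()) / len(nodes)
avg_out = sum(out_degree.values()) / len(nodes)

print(avg_in, avg_out)  # 0.6 0.6
```

On your real graph, the same check can be done by summing the values from G.in_degree() and G.out_degree() and dividing each by G.number_of_nodes().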
Upvotes: 2