Reputation: 121
I have the following dataset:
Company_ID Firm_Name
125911 Ampersand
125911 BancBoston
32679 BP Corp
74240 CORNING
32679 DIEBOLD
32679 DIEBOLD
74240 Fidelity
74240 Greylock
32679 INCO
67734 INCO
67734 Innova
32679 Kleiner
67734 Kleiner
67734 Kleiner
67734 Mayfield
32679 Pliant
67734 Pliant
67734 Sofinnova
43805 Warburg
The dataframe shows which investment firms invested in the same company during a year. I want to create a network graph of the connections between the firms (Firm_Name) only. For example, Ampersand and BancBoston both invested in the same company and should therefore be connected. The code I have tried is:
import networkx as nx
G = nx.from_pandas_edgelist(df, 'Company_ID', 'Firm_Name')
nx.draw_shell(G, with_labels=True)
Which generates the following graph:
This shows the connections of both Company_ID and Firm_Name. I only want to have the Firms as nodes, where they are connected if they have invested in the same company. I have not found any similar problems or similar datasets where networkx is used. Any help is greatly appreciated!
Upvotes: 0
Views: 1162
Reputation: 323226
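Try a self-merge on Company_ID, which pairs up every two firms that share a company, then build the graph from those pairs:
out = df.merge(df, on='Company_ID')
out = out[out['Firm_Name_x'] != out['Firm_Name_y']]  # drop each firm paired with itself
G = nx.from_pandas_edgelist(out, 'Firm_Name_x', 'Firm_Name_y')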
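For reference, here is a self-contained sketch of the merge approach run on the sample data from the question. A few details are my own assumptions rather than part of the original snippet: duplicate rows are dropped first, the `Firm_Name_x < Firm_Name_y` filter keeps each pair once and removes self-pairs, and firms with no co-investor (such as Warburg) are added back as isolated nodes.

```python
import networkx as nx
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Company_ID": [125911, 125911, 32679, 74240, 32679, 32679, 74240,
                   74240, 32679, 67734, 67734, 32679, 67734, 67734,
                   67734, 32679, 67734, 67734, 43805],
    "Firm_Name": ["Ampersand", "BancBoston", "BP Corp", "CORNING",
                  "DIEBOLD", "DIEBOLD", "Fidelity", "Greylock", "INCO",
                  "INCO", "Innova", "Kleiner", "Kleiner", "Kleiner",
                  "Mayfield", "Pliant", "Pliant", "Sofinnova", "Warburg"],
})

# Assumption: repeated (company, firm) rows carry no extra meaning here.
unique_rows = df.drop_duplicates()

# Self-merge: every pair of firms that share a Company_ID.
pairs = unique_rows.merge(unique_rows, on="Company_ID")

# Keep each unordered pair once; this also removes firm-with-itself rows.
pairs = pairs[pairs["Firm_Name_x"] < pairs["Firm_Name_y"]]

# Firms become the only nodes; an edge means "invested in the same company".
G = nx.from_pandas_edgelist(pairs, "Firm_Name_x", "Firm_Name_y")

# Assumption: firms without any co-investor should still appear as nodes.
G.add_nodes_from(df["Firm_Name"].unique())
```

The resulting `G` can be passed to `nx.draw_shell(G, with_labels=True)` as in the question. If the isolated firms are not wanted in the plot, leave out the `add_nodes_from` line.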
Upvotes: 2