Sushant

Reputation: 469

Get all the connected components from a set of pairs in pandas dataframe

I have a dataframe in which each row holds a pair of matched values, with the row index acting as the key. I want to group the values into chains of linked items:

       match1  match2
0          12       5
1          34      12
2          18      29
3          69      31
4          33      34
5          15      69

I'd like to get an output that links all the connected components into lists maybe like this:

[12,5,34,33], [18,29], [69, 31, 15]

EDIT: I tried this earlier:

        rev_matches = df[['match1', 'match2']]
        rev_matches['match_list'] = rev_matches.values.tolist()
        rev_matches = rev_matches[['match_list']]
        rev_matches['Key'] = rev_matches.index
        rev_matches = rev_matches.explode('match_list')
        G = nx.from_pandas_edgelist(rev_matches, 'match_list', 'Key')
        l = list(nx.connected_components(G))

This didn't work either: the connections it produced were inaccurate. Can someone also explain where this goes wrong? Thanks.

Upvotes: 3

Views: 1302

Answers (1)

anky

Reputation: 75110

This looks like a job for networkx:

import networkx as nx
G = nx.Graph()
G.add_edges_from(df[['match1','match2']].to_numpy().tolist())
print(list(nx.connected_components(G)))
#[{5, 12, 33, 34}, {18, 29}, {15, 31, 69}]
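
As for why the attempt in the question went wrong, a likely explanation is that it links every value to its row index, so the row indexes 0 to 5 become graph nodes too. Row index 5 is then the same node as the value 5 from row 0, which joins two chains that should stay separate. The sketch below rebuilds the sample frame (an assumption based on the table in the question), mirrors the edges that the exploded frame would produce under the hypothetical name `bad_edges`, and compares that with building the graph straight from the two columns using `nx.from_pandas_edgelist`:

```python
import networkx as nx
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "match1": [12, 34, 18, 69, 33, 15],
    "match2": [5, 12, 29, 31, 34, 69],
})

# Edges equivalent to the question's attempt: each value is linked to the
# index of the row it came from, so row indexes become nodes as well.
pairs = df[["match1", "match2"]].to_numpy().tolist()
bad_edges = [(value, key) for key, pair in zip(df.index, pairs) for value in pair]
G_bad = nx.Graph()
G_bad.add_edges_from(bad_edges)
bad_components = list(nx.connected_components(G_bad))

# Direct approach: every row is one edge between match1 and match2.
G_good = nx.from_pandas_edgelist(df, "match1", "match2")
good_components = [sorted(c) for c in nx.connected_components(G_good)]

print(len(bad_components))   # fewer groups than expected, because of the collision
print(good_components)
```

If the row index has to be part of the graph for some other reason, giving those nodes distinct labels (for example tuples such as `("row", 5)`) should avoid the collision.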

Upvotes: 6
