Tejas Bawaskar

Reputation: 306

Optimization: Create a list of clusters in the most optimized way

I have a data frame of pairs that looks like this:

Col_1  Col_2
A      B
B      G
A      C
D      F
E      F

From these pairs, I want to build a list in which each element holds all the values that are linked to one another, directly or through a chain of pairs. The output should look like this:

output[[1]]
> A B G C

output[[2]]
> D F E

Order does not matter, either among the list elements or among the values within an element.

I have written some lengthy code that seems inefficient to me (I'm happy to share it if anyone wants to see it). Is there a more efficient way to tackle this problem?

Upvotes: 3

Views: 67

Answers (1)

tmfmnk

Reputation: 40131

One way is to use the igraph package: treat each pair as an edge of an undirected graph and split the vertices by connected component:

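library(igraph)
# Build an undirected graph from the pairs and label each vertex with its connected component
membership <- components(graph_from_data_frame(df, directed = FALSE))$membership
split(names(membership), membership)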

$`1`
[1] "A" "B" "G" "C"

$`2`
[1] "D" "E" "F"
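
If adding a dependency is not an option, the same grouping can be sketched in base R with a simple union-find over the pairs. This is only a rough sketch, not a tuned implementation: `find_clusters()` and `find_root()` are hypothetical helper names, `df` is rebuilt here from the example in the question, and the code assumes the two columns contain no missing values.

```r
# Example data copied from the question
df <- data.frame(
  Col_1 = c("A", "B", "A", "D", "E"),
  Col_2 = c("B", "G", "C", "F", "F"),
  stringsAsFactors = FALSE
)

# find_clusters() is a hypothetical helper, not part of base R or igraph
find_clusters <- function(from, to) {
  from <- as.character(from)
  to <- as.character(to)
  nodes <- unique(c(from, to))

  # parent[i] is the index of node i's parent; a root points to itself
  parent <- seq_along(nodes)

  # Follow parent links until a root is reached
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  a <- match(from, nodes)
  b <- match(to, nodes)
  for (k in seq_along(a)) {
    ra <- find_root(a[k])
    rb <- find_root(b[k])
    # Merge the two groups when the pair connects different roots
    if (ra != rb) parent[rb] <- ra
  }

  # Nodes that share a root belong to the same cluster
  roots <- vapply(seq_along(nodes), find_root, integer(1))
  unname(split(nodes, roots))
}

output <- find_clusters(df$Col_1, df$Col_2)
output
```

On the example data this gives two groups, A B G C and D E F. For large inputs the igraph version is likely to be faster, since this sketch does no path compression.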

Upvotes: 1
