williamg15

Reputation: 77

Selecting clusters based on number of nodes iGraph/R

Is there a way to select a subgraph/subset containing only the clusters with at most a given number of vertices?

Essentially I want to do something like:

want <- components(X)$csize < 20  

I thought about merging the cluster IDs from the graph data frame into the node df, then using a count, or something similar, to subset the original df and build the graph data frame again.

Upvotes: 1

Views: 795

Answers (1)

struggles

Reputation: 865

Here is a potential solution using a random graph. Use groups on the result of components to find which vertices belong to which component, then use length to find how big each component is:

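library(igraph)  # also exports the %>% pipe

set.seed(4321)
# random graph: 100 vertices, 40 edges, undirected, no self-loops
g <- sample_gnm(100, 40, directed = FALSE, loops = FALSE)
plot(g, vertex.size = 5, vertex.label = '')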

The entire graph with all components

want <- g %>%
  components %>%
  groups %>%
  .[sapply(., length) > 3]

want is then a list holding the vertices of each component with more than three vertices:

$`1`
[1]  1 34 38 45 75

$`3`
 [1]   3  12  24  39  50  54  58  60  67  84  97  99 100

$`5`
[1]  5 35 37 41 44 53 65 90

Then you can remove all vertices that aren't included in want:

newG <- g %>%
  {. - V(.)[! as.numeric(V(.)) %in% unlist(want)]}

plot(newG, vertex.size = 5, vertex.label = '')

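As a variation on the steps above, the same filtering can be sketched without groups by indexing the component sizes with each vertex's membership, which is close to the csize comparison in the question. This is only a minimal sketch on the same random graph; the names comp and keep are illustrative, and the > 3 threshold simply mirrors the example above.

```r
library(igraph)

set.seed(4321)
g <- sample_gnm(100, 40, directed = FALSE, loops = FALSE)

comp <- components(g)

# comp$membership holds the component id of every vertex, so indexing
# comp$csize with it yields the size of the component each vertex is in
keep <- which(comp$csize[comp$membership] > 3)

# keep only those vertices and the edges between them
newG <- induced_subgraph(g, keep)
```

To keep only the small clusters, as in the question, flip the comparison, for example comp$csize[comp$membership] < 20.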

Upvotes: 1
