Reputation: 77
Is there a way to select a subgraph/subset where each cluster has at most a given number of vertices?
Essentially I want to do something like:
want <- components(X)$csize < 20
I thought about merging the cluster IDs from the graph data frame into the node data frame, then using a count, or something similar, to subset the original data frame and compute the graph data frame again.
Upvotes: 1
Views: 795
Reputation: 865
Here is a potential solution using a random graph. You will need to use groups() on the result of components() to identify which nodes belong to which component, then use length() to find out how big each component is:
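library(igraph)  # igraph also re-exports the %>% pipe
set.seed(4321)
# random graph with 100 vertices and 40 edges, undirected, no loops
g <- sample_gnm(100, 40, directed = FALSE, loops = FALSE)
plot(g, vertex.size = 5, vertex.label = '')
# keep only the components that have more than 3 vertices
want <- g %>%
  components %>%
  groups %>%
  .[sapply(., length) > 3]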
Printing want gives the following:
$`1`
[1] 1 34 38 45 75
$`3`
[1] 3 12 24 39 50 54 58 60 67 84 97 99 100
$`5`
[1] 5 35 37 41 44 53 65 90
Then you can remove all nodes that aren't included in want:
# delete every vertex whose id is not in one of the selected components
newG <- g %>%
  {. - V(.)[! as.numeric(V(.)) %in% unlist(want)]}
plot(newG, vertex.size = 5, vertex.label = '')
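Since the question asks for an upper bound on cluster size rather than a lower one, the same idea can probably be written more directly with the membership and csize fields returned by components(), followed by induced_subgraph(). This is only a sketch: the threshold max_size and the names comp, keep and small_g are made up for illustration, and the random graph is the same one as above.

```r
library(igraph)

set.seed(4321)
g <- sample_gnm(100, 40, directed = FALSE, loops = FALSE)

comp <- components(g)

# Hypothetical threshold: keep components with at most this many vertices
max_size <- 3

# comp$membership holds the component id of each vertex, so indexing
# comp$csize by it gives the size of the component each vertex belongs to
keep <- which(comp$csize[comp$membership] <= max_size)

# Subgraph made of the kept vertices and the edges between them
small_g <- induced_subgraph(g, keep)
```

Flipping the comparison (for example to > 3) should select the same kind of components as the groups() approach above. Note that induced_subgraph() renumbers the vertices, so the ids in small_g will not match those in g unless you store them in a vertex attribute first.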
Upvotes: 1