Reputation: 2572
I currently have a 1000x1000 adjacency matrix, and I am trying to see whether there are "clusters" in the data as a whole. For example, if I were to print it out on a piece of paper with small enough 0s and 1s, I could see a pattern showing where the 1s congregate.
I am wondering if there is a way to do this in R, either by shrinking the entire matrix into a grid small enough to read, or with some visualization method like a heatmap. I am wary of a heatmap because it seems to apply a similarity measure to the data. Instead, I just want a bird's-eye view of where the 0s and 1s might cluster.
The data I have is essentially an adjacency matrix created by igraph in R via the Stochastic Block Model function sample_sbm. It contains 1000 nodes, with 100 communities. A reproducible example is as follows:
library(igraph)
pref.matrix <- matrix(rep(0.07, 100*100), ncol = 100)
diag(pref.matrix) <- rep(0.01, 100)
g <- sample_sbm(1000, pref.matrix = pref.matrix, block.sizes = rep(10, 100))
A <- as.matrix(as_adj(g)) # The adjacency matrix of 1000 by 1000
I am wondering how I can get a visualization of this matrix so that I can see if the 100 groups really do appear to cluster?
Upvotes: 1
Views: 120
Reputation: 37661
As in your previous question, heatmap should work for this. I think the reason you are unhappy with the result is the way you are generating the graph. The heatmap will show this.
Using your matrix A, you can make a heatmap with:
heatmap(A, Rowv=NA, Colv=NA, col=terrain.colors(16),
labRow=FALSE, labCol=FALSE, revC=TRUE)
A 1000 x 1000 image is too big to see much detail, so I will enlarge the upper left corner - the first 10 groups of 10.
heatmap(A[1:100, 1:100], Rowv=NA, Colv=NA, col=terrain.colors(16),
labRow=FALSE, labCol=FALSE, revC=TRUE)
This starts to show what is going on, but I don't think it is what you intended. The result follows from the way you made the preference matrix.
pref.matrix <- matrix(rep(0.07, 100*100), ncol = 100)
diag(pref.matrix) <- rep(0.01, 100)
You start out with all entries being 0.07 - i.e. a fairly low probability of a connection between any two groups. Then you alter the diagonal and set it to 0.01 - quite a bit lower. What you are requesting is a small chance of a connection between nodes in two different groups, and a very low chance of a connection between nodes within the same group. I suspect that this is not what you intended. If you take a point in some group, its probability of being connected to any given point outside its group is the same everywhere - that should not create any clusters. And the probability of being connected to a point inside its own group is lower still. No clusters there either. I think what you wanted was for a point to be more likely to connect to a point within its group than to a point in a different group.
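To put rough numbers on this, here is a small sketch of the expected neighbour counts for a single node under the original settings (100 groups of 10 nodes). The variable names are mine, chosen for illustration; the probabilities are the ones from the question's pref.matrix.

```r
# Expected number of neighbours for one node under the question's settings.
# Variable names are illustrative only.
n_groups   <- 100
group_size <- 10
p_within   <- 0.01   # diagonal of the question's pref.matrix
p_between  <- 0.07   # off-diagonal of the question's pref.matrix

# 9 other nodes share the node's group; 990 nodes sit in other groups.
exp_within  <- (group_size - 1) * p_within
exp_between <- (n_groups - 1) * group_size * p_between

print(exp_within)   # 0.09
print(exp_between)  # 69.3
```

So a node expects well under one neighbour inside its own group and roughly 69 outside it, which is the opposite of a community structure.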
So maybe you needed something like this:
pref.matrix <- matrix(rep(0.04, 100*100), ncol = 100)
diag(pref.matrix) <- rep(0.4, 100)
g2 <- sample_sbm(1000, pref.matrix = pref.matrix, block.sizes = rep(10, 100))
A2 <- as.matrix(as_adj(g2))
This makes it much more likely for a point to connect to its own group than to another group. You can see this in the enlarged heatmap.
heatmap(A2[1:100, 1:100], Rowv=NA, Colv=NA, col=terrain.colors(16),
labRow=FALSE, labCol=FALSE, revC=TRUE)
Now there are groups forming down the diagonal.
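If you also want the bird's-eye view of the whole matrix rather than one corner, one option is to shrink it to one cell per pair of groups by averaging the entries in each block. This is a minimal sketch, assuming the nodes are ordered by group (which is how the heatmaps above are being read as well). block_density is a hypothetical helper of mine, not an igraph function, and the toy matrix is made up purely to check the arithmetic.

```r
# Collapse an adjacency matrix into a (groups x groups) matrix of edge densities.
# Assumes rows/columns are ordered by group and every group has at least 2 nodes.
block_density <- function(A, block.sizes) {
  grp <- rep(seq_along(block.sizes), times = block.sizes)
  # Sum rows by group, then columns by group: number of 1s in each block
  counts <- t(rowsum(t(rowsum(A, grp)), grp))
  # Number of possible entries in each block
  cells <- outer(block.sizes, block.sizes)
  # Within a group, the diagonal cells (self-loops) cannot be edges
  diag(cells) <- block.sizes * (block.sizes - 1)
  counts / cells
}

# Toy check on a hand-made 4-node matrix with two groups of 2 (made-up data)
toy <- matrix(0, 4, 4)
toy[1, 2] <- toy[2, 1] <- 1   # one edge inside group 1
toy[1, 3] <- toy[3, 1] <- 1   # one edge between group 1 and group 2
D <- block_density(toy, c(2, 2))
print(D)
```

With the matrices above, something like D2 <- block_density(A2, rep(10, 100)) followed by heatmap(D2, Rowv=NA, Colv=NA, revC=TRUE) should give a 100 x 100 picture in which the diagonal stands out if the groups really do cluster. Comparing mean(diag(D2)) with mean(D2[upper.tri(D2)]) gives the same information as two numbers.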
Upvotes: 2
Reputation: 226741
How about
A0 <- as_adj(g) ## leave as sparse (dgCMatrix)
library(Matrix)
image(A0)
... ?
Upvotes: 1