Reputation: 1801
I'm wondering if there's a way to quickly calculate the number of shared neighbors between any pairs of nodes (i.e., the number of nodes connected to both node i and j) from an adjacency matrix (like the one below) and then return the output in matrix format?
I've referenced these following posts, Find common neighbors of selected vertices Counting the number of neighbors in common between two vertices in R Calculate number of common neighbors in igraph R
but can't seem to find many clues to do what I want to achieve. I was told that this can be directly calculated as adjm'
*adjm
, but I am not sure if this makes sense. It will be really appreciated if someone could shed some light on this.
# create an adj. matrix
adjm <- matrix(sample(0:1, 100, replace=TRUE, prob=c(0.6,0.4)), nc=10)
# set diagonal element to 0
diag(adjm) <- 0
# making it symmetric
adjm[lower.tri(adjm)] = t(adjm)[lower.tri(adjm)]
Upvotes: 3
Views: 1386
Reputation: 25703
If the graph is undirected, then the matrix of the number of shared neighbours is A.A
, where I used .
to denote the matrix product. As others note, the correct operator in R is %*%
. Transposition is not necessary, since in the undirected case, the adjacency matrix is symmetric: A' = A
.
If the graph is directed, let's assume that A_{ij}
denotes the number of connections from i
to j
(this is igraph's interpretation).
Then the i,j
element of A'.A
gives the number of nodes that point to both i
and j
(i <- v -> j
), and A.A'
gives the number of nodes that both i
and j
point to (i -> v <- j
). These quantities are sometimes called cocitation and bibliographic coupling, respectively, and can be computed using cocitation()
and bibcoupling()
in igraph. In the undirected case, both of these functions return the same, thus you may use either.
Upvotes: 2
Reputation: 101373
I think @G5W's solution is super concise already, highly recommended!
Below is a graph theory way to solve it, maybe not that efficient:
> nb <- neighborhood(g,mindist = 1)
> outer(nb, nb, FUN = Vectorize(function(a, b) length(intersect(a, b))))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 6 1 1 4 2 2 2 5 3 1
[2,] 1 3 2 0 1 3 0 1 2 1
[3,] 1 2 4 1 2 4 1 1 3 3
[4,] 4 0 1 4 2 1 1 3 2 1
[5,] 2 1 2 2 4 2 0 1 3 2
[6,] 2 3 4 1 2 5 1 2 3 3
[7,] 2 0 1 1 0 1 2 2 1 1
[8,] 5 1 1 3 1 2 2 5 3 1
[9,] 3 2 3 2 3 3 1 3 7 3
[10,] 1 1 3 1 2 3 1 1 3 4
> t(adjm) %*% adjm
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 6 1 1 4 2 2 2 5 3 1
[2,] 1 3 2 0 1 3 0 1 2 1
[3,] 1 2 4 1 2 4 1 1 3 3
[4,] 4 0 1 4 2 1 1 3 2 1
[5,] 2 1 2 2 4 2 0 1 3 2
[6,] 2 3 4 1 2 5 1 2 3 3
[7,] 2 0 1 1 0 1 2 2 1 1
[8,] 5 1 1 3 1 2 2 5 3 1
[9,] 3 2 3 2 3 3 1 3 7 3
[10,] 1 1 3 1 2 3 1 1 3 4
Upvotes: 1
Reputation: 37641
Yes, you can get the number of shared neighbors by computing the matrix
product of adjm' and adjm. Since you are using R, adjm'*adjm
means
the component-wise product of the matrices. We want the matrix product,
so you need to use %*%. I will use that below.
To simplify the notation, I will denote adjm = A where A[i,j] is 1 if there is a link between nodes i and j (they are neighbors) and A[i,j] = 0 otherwise.
Let's compute t(A) %*% A.
The i-jth coordinate of t(A) %*% A is
(t(A) %*% A)[i,j] =
sum(t(A)[i,k] * A[k,j]) =
sum(A[k,i] * A[k,j])
All of the products in the sum are either 0 or 1.
If both A[k,i]=1 and A[k,j]=1, the product is 1,
otherwise it is zero. So (t(A)%*%A)[i,j]
is equal to the
number of different k's for which both A[k,i]=1 and
A[k,j]=1. But A[k,i]=1 means k is a neighbor of i
and A[k,j]=1 means k is a neighbor of j, so
(t(A)%*%A)[i,j]
is equal to the number of different k's
for which k is a neighbor of both i and j.
Let's try it out on your example. In order to make the results reproducible, I set the random.seed.
library(igraph)
## For reproducibility
set.seed(1492)
# create an adj. matrix
adjm <- matrix(sample(0:1, 100, replace=TRUE, prob=c(0.6,0.4)), nc=10)
# set diagonal element to 0
diag(adjm) <- 0
# making it symmetric
adjm[lower.tri(adjm)] = t(adjm)[lower.tri(adjm)]
Shared = t(adjm) %*% adjm
g = graph_from_adjacency_matrix(adjm, mode = "undirected")
plot(g)
Notice for example that Shared[1,4] = 4.
That is because nodes 1 and 4 have four shared neighbors,
Nodes 2,3,6 and 9. Shared[5,7]=0 because nodes 5 and 7
have no neighbors in common.
Upvotes: 3