Reputation: 259
I have a dataset in R that records the membership of countries in international organizations (the original dataset can be found here: IGO_stateunit_v2.3.zip).
Here is a small example of the basic structure of the data:
cntr <- c('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J')
UNO <- c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
APEC <- c(0, 0, 0, 0, 1, 1, 1, 0, 0, 0)
ASEAN <- c(0, 0, 0, 0, 1, 1, 0, 0, 0, 0)
data <- data.frame(cntr, UNO, APEC, ASEAN)
The data look like this, where 1 indicates membership in an organization:
cntr UNO APEC ASEAN
A 0 0 0
B 1 0 0
C 1 0 0
D 1 0 0
E 1 1 1
F 1 1 1
G 1 1 0
H 1 0 0
I 1 0 0
J 1 0 0
From this data I would like to create a matrix in R that counts, for each pair of countries, the number of organizations in which both are members (with the diagonal set to 0). The result should look like this:
cntr A B C D E F G H I J
A 0 0 0 0 0 0 0 0 0 0
B 0 0 1 1 1 1 1 1 1 1
C 0 1 0 1 1 1 1 1 1 1
D 0 1 1 0 1 1 1 1 1 1
E 0 1 1 1 0 3 2 1 1 1
F 0 1 1 1 3 0 2 1 1 1
G 0 1 1 1 2 2 0 1 1 1
H 0 1 1 1 1 1 1 0 1 1
I 0 1 1 1 1 1 1 1 0 1
J 0 1 1 1 1 1 1 1 1 0
Does anyone have an idea how to calculate this connectivity matrix? Any help would be greatly appreciated!
Upvotes: 3
Views: 209
Reputation: 13314
Your data:
d <- structure(list(cntr = structure(1:10, .Label = c("A", "B", "C",
"D", "E", "F", "G", "H", "I", "J"), class = "factor"), UNO = c(0L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), APEC = c(0L, 0L, 0L, 0L,
1L, 1L, 1L, 0L, 0L, 0L), ASEAN = c(0L, 0L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L)), .Names = c("cntr", "UNO", "APEC", "ASEAN"), class = "data.frame", row.names = c(NA,
-10L))
Solution:
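m <- as.matrix(d[, -1])   # drop the country column; rows = countries, columns = organizations
m2 <- m %*% t(m)          # entry [i, j] = number of organizations shared by countries i and j
# Alternatively, m2 <- tcrossprod(m) gives the same result (thanks to @akrun for pointing that out)!
diag(m2) <- 0             # a country's own memberships do not count as shared, so zero the diagonal
dimnames(m2) <- list(as.character(d$cntr), as.character(d$cntr))  # label rows and columns by country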
m2
# A B C D E F G H I J
# A 0 0 0 0 0 0 0 0 0 0
# B 0 0 1 1 1 1 1 1 1 1
# C 0 1 0 1 1 1 1 1 1 1
# D 0 1 1 0 1 1 1 1 1 1
# E 0 1 1 1 0 3 2 1 1 1
# F 0 1 1 1 3 0 2 1 1 1
# G 0 1 1 1 2 2 0 1 1 1
# H 0 1 1 1 1 1 1 0 1 1
# I 0 1 1 1 1 1 1 1 0 1
# J 0 1 1 1 1 1 1 1 1 0
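Why this works: entry [i, j] of the product is the sum over organizations of m[i, k] * m[j, k], and with 0/1 data each term is 1 only when both countries are members of organization k. A minimal sketch to check one entry by hand, using the example data from the question (the names `shared`, `by_hand` and `own` are just illustrative):

```r
# Example data from the question.
d <- data.frame(
  cntr  = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J"),
  UNO   = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  APEC  = c(0, 0, 0, 0, 1, 1, 1, 0, 0, 0),
  ASEAN = c(0, 0, 0, 0, 1, 1, 0, 0, 0, 0)
)

m <- as.matrix(d[, -1])
rownames(m) <- as.character(d$cntr)

# Full pairwise product, before the diagonal is zeroed.
shared <- tcrossprod(m)

# The same count for the pair (E, F), computed by hand:
# multiply the two membership rows element-wise and add up.
by_hand <- sum(m["E", ] * m["F", ])

by_hand          # 3
shared["E", "F"] # 3

# Before zeroing, the diagonal holds each country's own number of memberships.
own <- diag(shared)
```

This also shows what `diag(m2) <- 0` throws away: the diagonal is simply each country's total membership count, which is why it is overwritten to match the desired output.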
EDIT: a slightly more compact solution, which takes the row and column labels from the row names instead of setting them by hand:
rownames(d) <- d$cntr
m <- tcrossprod(as.matrix(d[,-1]))
diag(m) <- 0
m
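One caveat for the full dataset: the cross product only counts shared memberships if every cell is 0 or 1. I have not checked the coding of the original file, so the following is a sketch under the assumption that only the value 1 means membership and that any other value (other codes, NA) should count as 0. The function name `shared_memberships`, the argument `id_col` and the toy data are hypothetical and only there for illustration:

```r
# Hypothetical helper: pairwise shared-membership counts.
# Assumption: only cells equal to 1 denote membership; everything else counts as 0.
shared_memberships <- function(df, id_col = "cntr") {
  org_cols <- setdiff(names(df), id_col)
  m <- as.matrix(df[, org_cols, drop = FALSE])
  m <- (m == 1) * 1          # recode to strict 0/1 (NA stays NA for now)
  m[is.na(m)] <- 0           # treat missing values as non-membership
  rownames(m) <- as.character(df[[id_col]])
  out <- tcrossprod(m)       # same as m %*% t(m)
  diag(out) <- 0             # ignore a country's overlap with itself
  out
}

# Made-up toy data with some non-0/1 cells, purely to exercise the recoding.
toy <- data.frame(
  cntr = c("X", "Y", "Z"),
  ORG1 = c(1, 1, -9),
  ORG2 = c(1, 1, 1),
  ORG3 = c(NA, 2, 1)
)

res <- shared_memberships(toy)
res
```

Without the recoding step, a cell such as 2 or -9 would be multiplied into the sums and the entries would no longer be counts.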
Upvotes: 6