Create matrix based upon group membership

Question

I would like to create a matrix that indicates group membership from a data frame: an N x N matrix where 1 means two neighborhoods are in the same city and 0 means they are in different cities (the diagonal is 0, since a neighborhood is not paired with itself). For example:

hoodid <- c(1:10) 
cityid <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
df <- data.frame(hoodid, cityid)
df

#    hoodid cityid
# 1       1      1
# 2       2      1
# 3       3      1
# 4       4      2
# 5       5      2
# 6       6      3
# 7       7      3
# 8       8      3
# 9       9      3
# 10     10      3

The desired outcome is:

# 0 1 1 0 0 0 0 0 0 0
# 1 0 1 0 0 0 0 0 0 0
# 1 1 0 0 0 0 0 0 0 0 
# 0 0 0 0 1 0 0 0 0 0
# 0 0 0 1 0 0 0 0 0 0 
# 0 0 0 0 0 0 1 1 1 1
# 0 0 0 0 0 1 0 1 1 1 
# 0 0 0 0 0 1 1 0 1 1 
# 0 0 0 0 0 1 1 1 0 1 
# 0 0 0 0 0 1 1 1 1 0

Frank · Accepted Answer

This works:

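library(Matrix)
# Build one block per city (ones off the diagonal, zeros on it),
# sized by the number of rows in that city, then place the blocks
# along the diagonal of a sparse matrix.
m = do.call(bdiag, lapply(
  lengths(split(df$cityid, df$cityid)), 
  function(n) 1 - diag(n)
))
m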

# 10 x 10 sparse Matrix of class "dgCMatrix"
#                          
#  [1,] . 1 1 . . . . . . .
#  [2,] 1 . 1 . . . . . . .
#  [3,] 1 1 . . . . . . . .
#  [4,] . . . . 1 . . . . .
#  [5,] . . . 1 . . . . . .
#  [6,] . . . . . . 1 1 1 1
#  [7,] . . . . . 1 . 1 1 1
#  [8,] . . . . . 1 1 . 1 1
#  [9,] . . . . . 1 1 1 . 1
# [10,] . . . . . 1 1 1 1 .

This assumes that your data is sorted by cityid (the blocks are laid out in sorted cityid order, so unsorted rows would not line up with them) and that it has no duplicate rows or other oddities.

You can call as.matrix(m) if you want a plain base R matrix.
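If the rows are not sorted by cityid, one alternative is to compare every pair of rows directly. The following is a minimal sketch in base R, not part of the original answer; the name m2 is just an illustrative choice, and the result is a dense N x N matrix, so it may use a lot of memory for large N.

```r
# Same example data as in the question
hoodid <- 1:10
cityid <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
df <- data.frame(hoodid, cityid)

# TRUE where two rows share a cityid; unary + turns the logical matrix into 0/1
m2 <- +outer(df$cityid, df$cityid, "==")

# A neighborhood should not be marked as sharing a city with itself
diag(m2) <- 0

# Optional: label rows and columns by hoodid
dimnames(m2) <- list(df$hoodid, df$hoodid)
```

Because each cell depends only on the two rows being compared, the row order of df does not matter here, and the rows and columns of m2 follow whatever order df is in.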
