Reputation: 1598
I have a square matrix of co-occurrence data, like:
m <- matrix(c(30,  30, 30,  30,  20,  0,   0,
              30, 373, 30, 204, 207,  0, 290,
              30,  30, 65,  65,  20, 35,   0,
              30, 204, 65, 239,  38, 35, 156,
              20, 207, 20,  38, 207,  0, 134,
               0,   0, 35,  35,   0, 35,   0,
               0, 290,  0, 156, 134,  0, 290),
            nrow = 7, byrow = TRUE)
Comparing the off-diagonal elements with the diagonal, some off-diagonal elements are equal to a diagonal element. I want to remove row and column i whenever some j satisfies:
if ((m[i,j] == m[i,i]) & (m[i,j] < m[j,j]))
This keeps only the row/column with the larger occurrence count and takes out a row/column whose element always co-occurs with another.
The output should be:
373 204
204 239
Thanks!
Upvotes: 2
Views: 2211
Reputation: 89057
Here is a vectorized approach:
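# row and column index of every cell, in column-major order
i <- as.vector(row(m))
j <- as.vector(col(m))
# TRUE where m[i,j] equals m[i,i] but is smaller than m[j,j]
k <- matrix(m == m[cbind(i, i)] & m < m[cbind(j, j)], nrow(m))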
#       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]
# [1,] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE
# [2,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# [3,] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
# [4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
# [5,] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
# [6,] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE
# [7,] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
delete.idx <- sort(unique(i[k]))
# [1] 1 3 5 6 7
keep.idx <- setdiff(seq_len(nrow(m)), delete.idx)
# [1] 2 4
m[keep.idx, keep.idx]
#      [,1] [,2]
# [1,]  373  204
# [2,]  204  239
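For reuse, the same rule can be wrapped in a function. This is only a sketch of an equivalent formulation, not a replacement for the code above: the name `drop_dominated` and the helper matrices `row_diag` and `col_diag` are made up for illustration, and it assumes the rule is applied in a single pass to a square numeric matrix.

```r
# Example matrix from the question
m <- matrix(c(30,  30, 30,  30,  20,  0,   0,
              30, 373, 30, 204, 207,  0, 290,
              30,  30, 65,  65,  20, 35,   0,
              30, 204, 65, 239,  38, 35, 156,
              20, 207, 20,  38, 207,  0, 134,
               0,   0, 35,  35,   0, 35,   0,
               0, 290,  0, 156, 134,  0, 290),
            nrow = 7, byrow = TRUE)

# Hypothetical helper: drop every row/column i for which some j has
# m[i, j] == m[i, i] and m[i, j] < m[j, j].
drop_dominated <- function(m) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m))
  n <- nrow(m)
  d <- diag(m)
  row_diag <- matrix(d, n, n)                # row_diag[i, j] is m[i, i]
  col_diag <- matrix(d, n, n, byrow = TRUE)  # col_diag[i, j] is m[j, j]
  k <- (m == row_diag) & (m < col_diag)      # cells that satisfy the rule
  keep <- which(rowSums(k) == 0)             # rows with no such cell
  m[keep, keep, drop = FALSE]                # drop = FALSE keeps a matrix
}

drop_dominated(m)
#      [,1] [,2]
# [1,]  373  204
# [2,]  204  239
```

Using `drop = FALSE` means the result stays a matrix even if only one row/column survives.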
Upvotes: 2