Identifying similar items from a matrix in R

Question

I have a matrix like this:

a <- c(0,45,19,48,28,19,0,0,62,3,61,62,0,0,0,63,29,0,0,0,0,0,62,63,0,0,0,0,0,29,0,0,0,0,0,0)
mat1 <- matrix(a,6,6,byrow = TRUE)
mat1
> mat1
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0   45   19   48   28   19
[2,]    0    0   62    3   61   62
[3,]    0    0    0   63   29    0
[4,]    0    0    0    0   62   63
[5,]    0    0    0    0    0   29
[6,]    0    0    0    0    0    0

Now, if any cell has a nonzero value less than 30, the items for the corresponding row and column are considered the same/similar. For example, [1,3] is 19, so items 1 and 3 are similar.

So for each row, we find the combinations that are similar (i.e., the cell value is less than 30).

Row 1 : [1,3],[1,5],[1,6]

Row 2 : [2,4]

Row 3 : [3,5]

Row 4 : none

Row 5 : [5,6]

Row 6 : none

So the similar combinations are [1,3], [1,5], [1,6], [2,4], [3,5] and [5,6]. The result should count the similar combinations after merging the transitive ones. Here the total should be only 2: [1,3], [1,5], [1,6], [3,5] and [5,6] are all linked to one another, so together they count as 1, and [2,4] counts as 1. Hence, the total number of same/similar groups for this matrix is 2.

There are multiple matrices of order n x m, so the solution should adapt to the number of rows and columns.

rdodhia · Accepted Answer

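This will output a list of combinations. Each non-NULL element of the list holds one group of similar combinations, with one pair per column.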

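library(data.table)

# Row/column indices of the cells that count as similar (0 < value < 30).
x <- data.table(which(mat1 < 30 & mat1 > 0, arr.ind = TRUE))
setkey(x, row)
x <- x[!(row == col)]  # drop diagonal cells, if any

s <- list()
for (j in unique(x$row)) {
  s[j] <- list(NULL)
  temp <- x[row == j, col]  # columns similar to row j
  for (i in temp) {
    s[[j]] <- cbind(s[[j]], c(j, i))
    # Add pairs (i, k) that link back into row j's group.
    for (k in x[row == i, col])
      if (k %in% c(temp, j)) s[[j]] <- cbind(s[[j]], c(i, k))
    # Remove those pairs so they are not reported again under row i.
    x <- x[!(row == i & col %in% c(temp, j))]
  }
}

s  # each non-NULL element is a 2-row matrix with one pair per column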
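
If only the count from the question is needed (2 for this matrix), one possible alternative is to treat each similar pair as an edge between two items and count the connected groups. The following is a minimal base R sketch, not part of the code above. It assumes the matrix is square, that rows and columns index the same items, and that zeros mean "no value"; count_similar_groups and its threshold argument are hypothetical names introduced for illustration.

```r
# Example matrix from the question.
a <- c(0,45,19,48,28,19, 0,0,62,3,61,62, 0,0,0,63,29,0,
       0,0,0,0,62,63, 0,0,0,0,0,29, 0,0,0,0,0,0)
mat1 <- matrix(a, 6, 6, byrow = TRUE)

# Hypothetical helper: counts groups of transitively similar items.
# Assumes a square matrix whose rows and columns index the same items.
count_similar_groups <- function(m, threshold = 30) {
  stopifnot(nrow(m) == ncol(m))
  n <- nrow(m)

  # Similar pairs: nonzero cells below the threshold.
  pairs <- which(m > 0 & m < threshold, arr.ind = TRUE)

  # Simple union-find: parent[i] points towards the representative of i's group.
  parent <- seq_len(n)
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  # Merge the groups of the two items in every similar pair.
  for (p in seq_len(nrow(pairs))) {
    r1 <- find_root(pairs[p, 1])
    r2 <- find_root(pairs[p, 2])
    if (r1 != r2) parent[r2] <- r1
  }

  # Count groups with more than one member; isolated items are not counted.
  roots <- vapply(seq_len(n), find_root, integer(1))
  sum(table(roots) > 1)
}

count_similar_groups(mat1)  # 2
```

The union-find step is what handles transitivity: [1,3] and [3,5] end up in the same group even though each pair is examined separately. For a non-square matrix the rows and columns would have to be mapped to a common set of item labels first, which this sketch does not attempt.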
