Peter Chung

Reputation: 1122

Compare rows of one matrix to another

I have two matrices: one from an experiment (df1) and one reference (df2). The values are semi-quantitative measurements from specimens, ranging from 1 to 50. For each row of df1, I would like to check whether all of its values are the same as those of a row in the reference.

df1:

      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    6   14   32   38   40   48
 [2,]    1   12   17   20   36   47
 [3,]    7   15   29   33   40   42
 [4,]    7   13   28   33   35   48
 [5,]    1    2   13   36   38   41
 [6,]   12   20   37   38   41   48
 [7,]   13   14   28   34   36   43
 ...more rows

 df2:
       [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    5   12   14   15   24   32
 [2,]    4    5   13   22   34   47
 [3,]    1   14   24   29   34   36
 [4,]    7   13   28   33   35   48
 [5,]   13   14   28   34   36   43
 [6,]    4   10   13   17   29   30
 [7,]    4   15   22   30   36   43
 [8,]    1   11   18   36   41   48
 [9,]   14   17   18   24   43   47
[10,]   13   24   32   34   41   47
...more rows

desired output:
 V1  V2   V3   V4   V5   V6   V7
 7   13   28   33   35   48   TRUE
13   14   28   34   36   43   TRUE

How can I compare every row of one matrix against another matrix and pick out the rows that match (all TRUE)? Thanks.

Upvotes: 1

Views: 92

Answers (2)

Roasty247

Reputation: 729

An alternative method using for(), which() and %in%:

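# No seed is set, so results vary between runs. These random 0/1 matrices
# usually share at least one row; run again if they do not.
data1 <- matrix(sample(c(0,1),60, replace = TRUE),ncol = 5)
data2 <- matrix(sample(c(0,1),60, replace = TRUE),ncol = 5)


# Collapse each row into a 'helper' character string. The separator keeps
# multi-digit values unambiguous (e.g. 1,12 versus 11,2).
data1.str <- apply(data1, 1, paste, collapse="_")
data2.str <- apply(data2, 1, paste, collapse="_")
data.match <- c()
# Loop over the reference rows and record which rows of data1 match each one
for(i in seq_along(data2.str)){
  data.match <- append(data.match, which(data1.str %in% data2.str[i]))
}
# Drop repeats caused by duplicated reference rows and restore row order
data.match <- sort(unique(data.match))
# Gives your matched rows already
data1[data.match, , drop = FALSE]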

# For completeness to give desired output:
matched <- as.data.frame(data1)
matched$data.match <- rep(FALSE,nrow(matched))
matched$data.match[data.match] <- TRUE

> matched[which(matched$data.match == TRUE),]
   V1 V2 V3 V4 V5 data.match
4   1  1  0  0  1       TRUE
6   0  1  1  1  1       TRUE
7   1  1  0  0  0       TRUE
9   0  0  0  0  0       TRUE
10  0  1  0  0  1       TRUE
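
One caveat if the helper-string idea is applied to values with more than one digit, such as the 1 to 50 range in the question: collapsing a row without any separator can make two different rows produce the same string. The following is a minimal sketch of that effect; the one-row matrices `a` and `b` are made up for illustration and are not from the question.

```r
# Two hypothetical one-row matrices that are clearly different
a <- matrix(c(1, 12), nrow = 1)
b <- matrix(c(11, 2), nrow = 1)

# Without a separator both rows collapse to the same string "112"
no_sep_a <- apply(a, 1, paste0, collapse = "")
no_sep_b <- apply(b, 1, paste0, collapse = "")
false_match <- no_sep_a %in% no_sep_b   # TRUE, although the rows differ

# With a separator the strings are "1_12" and "11_2"
sep_a <- apply(a, 1, paste, collapse = "_")
sep_b <- apply(b, 1, paste, collapse = "_")
true_result <- sep_a %in% sep_b         # FALSE
```

For the 0/1 example matrices every value is a single digit, so the collision cannot occur there.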

Upvotes: 1

Shree

Reputation: 11140

Here's one way of doing this -

x <- matrix(1:4, nrow=2)

     [,1] [,2]
[1,]    1    3
[2,]    2    4

y <- matrix(c(1,2,5,4), nrow=2)

     [,1] [,2]
[1,]    1    5
[2,]    2    4

do.call(paste, as.data.frame(x)) %in% do.call(paste, as.data.frame(y))

[1] FALSE  TRUE

I am guessing this should be faster than doing a dplyr inner_join on all columns.
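
As a sketch of how this might look on the data from the question, the rows printed there can be typed in by hand and filtered with the same paste and %in% idea. Only the visible rows are used (the "...more rows" are not reproduced), and the names `key1`, `key2`, `in_ref` and `result` are introduced here purely for illustration.

```r
# Rows visible in the question; the elided rows are left out
df1 <- matrix(c( 6, 14, 32, 38, 40, 48,
                 1, 12, 17, 20, 36, 47,
                 7, 15, 29, 33, 40, 42,
                 7, 13, 28, 33, 35, 48,
                 1,  2, 13, 36, 38, 41,
                12, 20, 37, 38, 41, 48,
                13, 14, 28, 34, 36, 43), ncol = 6, byrow = TRUE)
df2 <- matrix(c( 5, 12, 14, 15, 24, 32,
                 4,  5, 13, 22, 34, 47,
                 1, 14, 24, 29, 34, 36,
                 7, 13, 28, 33, 35, 48,
                13, 14, 28, 34, 36, 43,
                 4, 10, 13, 17, 29, 30,
                 4, 15, 22, 30, 36, 43,
                 1, 11, 18, 36, 41, 48,
                14, 17, 18, 24, 43, 47,
                13, 24, 32, 34, 41, 47), ncol = 6, byrow = TRUE)

# One key string per row; paste() puts a space between values by default
key1 <- do.call(paste, as.data.frame(df1))
key2 <- do.call(paste, as.data.frame(df2))

# TRUE where a row of df1 also occurs somewhere in df2
in_ref <- key1 %in% key2

# Keep only the matching rows and attach the flag as column V7
result <- cbind(as.data.frame(df1), V7 = in_ref)[in_ref, ]
result
```

With the rows shown, rows 4 and 7 of df1 are kept, which are the two rows listed in the desired output. Note that this compares values position by position, so a row containing the same numbers in a different order would not count as a match.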

Upvotes: 1
