JD Long

Reputation: 60746

Sort one matrix based on another matrix

I'm trying to put the rows of one matrix in the same order as the rows of another matrix of the same dimension. However, I can't quite figure out how to do this without an explicit loop. It seems I should be able to do this with subsetting and an apply or Map function, but I can't work out how.

Here's a toy example:

sortMe <- matrix(rnorm(6), ncol=2)
sortBy <- matrix(c(2,1,3, 1,3,2), ncol=2)

sorted <- sortMe 
for (i in 1:ncol(sortMe)) {
  sorted[,i] <- sortMe[,i][sortBy[,i]]
}

Using this method, the resulting sorted matrix contains the values from sortMe sorted in the same order as the sortBy matrix. Any idea how I'd do this without the loop?

Upvotes: 5

Views: 2851

Answers (3)

bdemarest

Reputation: 14667

I'm going to suggest that you stick with your original version. I would argue that the original loop you wrote is somewhat easier to read and comprehend (and probably easier to write) than the other solutions offered.

Also, the loop is nearly as fast as the other solutions, as the timings below show. (I borrowed @Josh O'Brien's timing code before he removed it from his post.)

set.seed(444)
n = 1e7
sortMe <- matrix(rnorm(2 * n), ncol=2)
sortBy <- matrix(c(sample(n), sample(n)), ncol=2)

#---------------------------------------------------------------------------
# @JD Long, original post.
system.time({
    sorted_JD <- sortMe
    for (i in 1:ncol(sortMe)) {
        sorted_JD[, i] <- sortMe[, i][sortBy[, i]]
    } 
})
#   user  system elapsed 
#  1.190   0.165   1.334 

#---------------------------------------------------------------------------
# @Julius (post is now deleted).
system.time({
    sorted_Jul2 <- sortMe
    sorted_Jul2[] <- sortMe[as.vector(sortBy) + 
        rep(0:(ncol(sortMe) - 1) * nrow(sortMe), each = nrow(sortMe))]
})
#   user  system elapsed 
#  1.023   0.218   1.226 

#---------------------------------------------------------------------------
# @Josh O'Brien
system.time({
    sorted_Jos <- sortMe
    sorted_Jos[] <- sortMe[cbind(as.vector(sortBy), as.vector(col(sortBy)))]
})
#   user  system elapsed 
#  1.070   0.217   1.274 

#---------------------------------------------------------------------------
# @Justin
system.time({
    sorted_Just = matrix(unlist(lapply(1:2,
        function(n) sortMe[,n][sortBy[,n]])), ncol=2)
})
#   user  system elapsed 
#  0.989   0.199   1.162 


all.equal(sorted_JD, sorted_Jul2)
# [1] TRUE
all.equal(sorted_JD, sorted_Jos)
# [1] TRUE
all.equal(sorted_JD, sorted_Just)
# [1] TRUE
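
For readers puzzling over the linear-index version timed above: R stores matrices in column-major order, so element (i, j) sits at position i + (j - 1) * nrow when the matrix is treated as a plain vector. Here is a minimal sketch of that idea on a tiny matrix. The values and the names offsets, linear_idx and sorted_lin are made up for illustration and are not from the original posts.

```r
# Toy data (illustrative values, not from the original question)
sortMe <- matrix(c(10, 20, 30, 40, 50, 60), ncol = 2)
sortBy <- matrix(c(2, 1, 3, 1, 3, 2), ncol = 2)

# Column j starts (j - 1) * nrow positions into the underlying vector
offsets <- rep(0:(ncol(sortMe) - 1) * nrow(sortMe), each = nrow(sortMe))
# offsets is 0 0 0 3 3 3

# Within-column row positions shifted to whole-matrix linear positions
linear_idx <- as.vector(sortBy) + offsets
# linear_idx is 2 1 3 4 6 5

sorted_lin <- sortMe
sorted_lin[] <- sortMe[linear_idx]
sorted_lin
#      [,1] [,2]
# [1,]   20   40
# [2,]   10   60
# [3,]   30   50
```

Note that `:` binds more tightly than `*` in R, so `0:(ncol(sortMe) - 1) * nrow(sortMe)` builds the sequence first and then multiplies it.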

Upvotes: 3

Josh O'Brien

Reputation: 162321

This (using a two-column integer matrix to index the matrix's two dimensions) should do the trick:

sorted <- sortMe
sorted[] <- sortMe[cbind(as.vector(sortBy), as.vector(col(sortBy)))]
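
To make the indexing visible, here is a small worked example with fixed numbers instead of rnorm. The values and the name idx are just for illustration. Each row of idx is a (row, column) pair, and indexing a matrix by a two-column matrix returns one element per pair.

```r
# Toy data (illustrative values, chosen so the result is easy to check)
sortMe <- matrix(c(10, 20, 30, 40, 50, 60), ncol = 2)
sortBy <- matrix(c(2, 1, 3, 1, 3, 2), ncol = 2)

# First column: which row to take; second column: which column to take it from
idx <- cbind(as.vector(sortBy), as.vector(col(sortBy)))
idx
#      [,1] [,2]
# [1,]    2    1
# [2,]    1    1
# [3,]    3    1
# [4,]    1    2
# [5,]    3    2
# [6,]    2    2

# sorted[] <- ... fills the existing matrix column by column, keeping its dims
sorted <- sortMe
sorted[] <- sortMe[idx]
sorted
#      [,1] [,2]
# [1,]   20   40
# [2,]   10   60
# [3,]   30   50
```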

Upvotes: 10

Justin

Reputation: 43255

Using lapply would work.

matrix(unlist(lapply(1:2, function(n) sortMe[,n][sortBy[,n]])), ncol=2)

But there is probably a more efficient way...
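
A possible variation, not part of the answer above: sapply can assemble the columns into a matrix itself, and seq_len(ncol(sortMe)) avoids hard-coding the number of columns. This is a sketch on made-up values; it assumes the matrices have more than one row, since sapply would simplify single-row results to a plain vector.

```r
# Toy data (illustrative values)
sortMe <- matrix(c(10, 20, 30, 40, 50, 60), ncol = 2)
sortBy <- matrix(c(2, 1, 3, 1, 3, 2), ncol = 2)

# One reordered column per j; sapply binds equal-length results as columns
sorted_sapply <- sapply(seq_len(ncol(sortMe)),
                        function(j) sortMe[sortBy[, j], j])
sorted_sapply
#      [,1] [,2]
# [1,]   20   40
# [2,]   10   60
# [3,]   30   50
```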

Upvotes: 3
