user1973511

Reputation: 21

Subset selection from binary matrix with dynamic column indices

A few questions for which the R language might have elegant solutions.

Given a matrix m containing the binary values 1 and 0, and a vector v of column indices:

  1. How would I write a function to extract all rows in m that have the value 1 in each of the columns indexed by the integers in v?
  2. As an extra feature, how would one return the row indices along with the corresponding rows?

It is probably best if I illustrate with an example.

Assume the logic I'm asking for resides in a function selectByIndices(matrix, indexVector).

So if we have the matrix (or perhaps the equivalent data frame):

 > (m = matrix(c(1, 0, 1, 1, 1,  0, 1, 1, 0, 1,  1, 0, 1, 1, 0,  1, 1, 1, 0, 1,
                 0, 1, 0, 0, 1), 5))

         [,1] [,2] [,3] [,4] [,5]
  [1,]    1    0    1    1    0
  [2,]    0    1    0    1    1
  [3,]    1    1    1    1    0
  [4,]    1    0    1    0    0
  [5,]    1    1    0    1    1

and index vectors:

 >c1 = c(1,3,4)
 >c2 =  c(4,5)
 >c3 =  c(1,3,5)

The function would behave something like this:

 >selectByIndices( m, c1)

        [,1] [,2] [,3] [,4] [,5]
  [1,]    1    0    1    1    0
  [3,]    1    1    1    1    0


 >selectByIndices( m, c2)

        [,1] [,2] [,3] [,4] [,5]
  [2,]    0    1    0    1    1
  [5,]    1    1    0    1    1


 >selectByIndices( m, c3)

    #no rows (i.e. empty collection) returned

I hope that's clear enough. Thanks in advance for your help.

Upvotes: 1

Views: 588

Answers (2)

IRTFM

Reputation: 263411

> # rown holds the column indices to test; drop = FALSE keeps the matrix shape
> selectRows <- function(mat, rown) mat[apply(mat[, rown, drop = FALSE] == 1, 1, all), , drop = FALSE]
> selectRows(m, c1)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    1    1    0
[2,]    1    1    1    1    0

> whichRows <- function(mat, rown) which(apply(mat[, rown, drop = FALSE] == 1, 1, all))
> whichRows(m, c1)
[1] 1 3
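
For the second part of the question (getting the row indices together with the rows), the two functions above could be combined into one. The following is only a sketch: the name selectWithIndices and the list layout (indices, rows) are my own assumptions, not something from the question or from this answer.

```r
# Hypothetical helper (name and return layout are assumptions):
# returns the matching row indices together with the rows themselves.
selectWithIndices <- function(mat, cols) {
  # Logical matrix: TRUE where a selected column holds the value 1
  hits <- mat[, cols, drop = FALSE] == 1
  # Keep the rows in which every selected column is TRUE
  idx <- which(apply(hits, 1, all))
  list(indices = idx, rows = mat[idx, , drop = FALSE])
}

# The example matrix from the question
m <- matrix(c(1, 0, 1, 1, 1,  0, 1, 1, 0, 1,  1, 0, 1, 1, 0,
              1, 1, 1, 0, 1,  0, 1, 0, 0, 1), 5)

res <- selectWithIndices(m, c(1, 3, 4))
res$indices   # 1 3
res$rows      # the two matching rows, as a 2 x 5 matrix
```

Using drop = FALSE in both subsetting steps means the result stays a matrix even when a single column is tested or only one row (or none) matches.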

Upvotes: 0

Josh O'Brien

Reputation: 162401

## Create a function that extracts the qualifying rows
f <- function(m, j) {
    ## drop = FALSE keeps a matrix even if j has one element or one row matches
    m[rowSums(m[, j, drop = FALSE]) == length(j), , drop = FALSE]
    # m[apply(m[, j], 1, function(X) all(X==1)),] ## This would also work
    # which(rowSums(m[, j]) == length(j))         ## & this would get row indices
}

## Try it out
f(m, c1)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    0    1    1    0
# [2,]    1    1    1    1    0

f(m, c2)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    1    0    1    1
# [2,]    1    1    0    1    1
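
Note that the printed row labels are renumbered after subsetting, whereas the question's expected output keeps the original row numbers. One possible way to keep them, sketched here under my own assumptions (the name f_labeled is hypothetical), is to store the positions as row names before subsetting:

```r
# Hypothetical variant of f(): each returned row carries its original
# position as a row name.
f_labeled <- function(m, j) {
  # Label rows with their positions so the labels survive subsetting
  rownames(m) <- seq_len(nrow(m))
  # A row qualifies when all of its selected columns equal 1
  m[rowSums(m[, j, drop = FALSE] == 1) == length(j), , drop = FALSE]
}

# The example matrix from the question
m <- matrix(c(1, 0, 1, 1, 1,  0, 1, 1, 0, 1,  1, 0, 1, 1, 0,
              1, 1, 1, 0, 1,  0, 1, 0, 0, 1), 5)

rownames(f_labeled(m, c(1, 3, 4)))   # "1" "3"
rownames(f_labeled(m, c(4, 5)))      # "2" "5"
nrow(f_labeled(m, c(1, 3, 5)))       # 0
```

The row names come back as character strings, so as.integer(rownames(...)) would be needed to use them as numeric indices again.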

Upvotes: 2
