user2199881

Reputation: 211

Check column values and keep/delete rows based on the percentage of columns satisfying a condition

I have a matrix of values, with several columns per row. What I want my code to do:

  1. Iterate over each row and check whether the value in each column is < a threshold (e.g. 1).
  2. Within the row, if at least (say) 80% of the columns satisfy that condition, keep the row; otherwise remove the whole row.

Code so far :

myfilt <- function(t,x){
             if ((length(which(t[x,] > 1)) / 60) >= 0.8){
               return(1)
             }else{
               return(0)
             }
          }

y=c()
for(i in 1:length(t[,1])){
   y = c(y,myfilt(t,i))
}

But when I print t[v,], all the rows have the same value. I am not sure what I am doing wrong. Also, if there is a shorter way to do this, please let me know.

P.S.: Here 't' is the name of the matrix I am testing.

Upvotes: 1

Views: 676

Answers (1)

juba

Reputation: 49033

Here's a way to do it:

## Parameters
threshold <- 0.8
perc.to.keep <- 0.5
## Example Matrix
set.seed(1337)
m <- matrix(rnorm(25,1,1),nrow=5,ncol=5)

#           [,1]      [,2]        [,3]       [,4]      [,5]
# [1,] 1.7122837 0.8383025 -0.02718379  2.2157099 2.1291008
# [2,] 0.2462742 2.4602621 -0.04117532 -0.6214087 1.4501467
# [3,] 1.0381899 3.0094584  0.12937698  0.9314247 1.0505864
# [4,] 2.1784211 0.9220618  1.85313022  0.9370171 0.8756698
# [5,] 0.8467962 2.3543421  0.37723981  2.0757077 1.9120115

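## TRUE where a value is below the threshold
test <- m < threshold
## Select rows whose fraction of values below the threshold is under perc.to.keep
sel <- apply(test,1,function(v) sum(v)/length(v)) < perc.to.keep
m[sel,]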

#           [,1]      [,2]        [,3]      [,4]      [,5]
# [1,] 1.7122837 0.8383025 -0.02718379 2.2157099 2.1291008
# [2,] 1.0381899 3.0094584  0.12937698 0.9314247 1.0505864
# [3,] 2.1784211 0.9220618  1.85313022 0.9370171 0.8756698
# [4,] 0.8467962 2.3543421  0.37723981 2.0757077 1.9120115
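
A shorter variant, as a rough sketch: `rowMeans()` on a logical matrix gives the fraction of TRUE values per row directly, so `apply()` is not needed. The helper below follows the rule stated in the question (keep a row when at least 80% of its values are below the threshold), which is the opposite direction from the example above. The names `keep_rows`, `min.frac` and `x` are made up for illustration. The last lines show one likely reason for "all the rows have the same value", assuming the index used in the question was the 0/1 vector built in the loop: a numeric index of 0 is dropped and 1 always picks row 1.

```r
# Hypothetical helper (not part of the code above): keep rows in which at
# least `min.frac` of the values are below `threshold`.
keep_rows <- function(mat, threshold = 1, min.frac = 0.8) {
  # rowMeans() of a logical matrix is the fraction of TRUE values per row
  frac.below <- rowMeans(mat < threshold)
  # Logical index; drop = FALSE keeps a matrix even if only one row survives
  mat[frac.below >= min.frac, , drop = FALSE]
}

# Small hand-made matrix, filled row by row
x <- matrix(c(0.1, 0.2, 0.3, 0.4, 2.0,   # 4 of 5 values below 1: kept
              0.1, 3.0, 3.0, 3.0, 3.0,   # 1 of 5 values below 1: dropped
              0.5, 0.5, 0.5, 0.5, 0.5),  # 5 of 5 values below 1: kept
            nrow = 3, byrow = TRUE)
res <- keep_rows(x)

# Indexing with a numeric 0/1 vector: 0 selects nothing, 1 selects row 1
y <- c(1, 0, 1)
x[y, ]       # row 1, twice
x[y == 1, ]  # rows 1 and 3, which is what was intended
```

Using the number of columns of the matrix (via `rowMeans()`) also avoids hard-coding the column count, as the 60 in the question's code does.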

Upvotes: 2
