user2199881

Reputation: 211

Using the apply function to keep only the rows that pass a threshold in R

I am trying to apply a filter to my data, which is a matrix with, say, 10 columns and 200 rows.

I want to retain only those rows where the coefficient of variation is greater than a threshold. But with the code I have, it seems it's printing the coefficient of variation for the rows passing the threshold. I want it to just test whether each row passes the threshold, but print the original data points in the matrix.

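# coefficient of variation of one row
covar <- function(x) sd(x) / mean(x)
# one value per row of myMatrix
evar <- apply(myMatrix, 1, covar)
# keep the original rows whose value exceeds the threshold
myMatrix_filt_var <- myMatrix[evar > 2, ]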

Here the threshold I set is 2.

What am I doing wrong? Sorry, I'm just learning R.

Thanks!

Upvotes: 0

Views: 805

Answers (1)

Arun

Reputation: 118849

If m is your matrix, then,

m[apply(m, 1, function(x) sd(x)/mean(x) > 2), ]

should give you the filtered matrix. The idea is to compute the coefficient of variation for every row and test whether it is > 2 inside the function passed to apply. This returns a logical vector with one element per row, and indexing the matrix with it, as in m[logical_vector, ], returns the original rows for which the condition is TRUE.
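To make the indexing step concrete, here is a small sketch on toy data. The matrix values and the names cv, keep and filtered are made up for illustration and are not from the question. It also adds drop = FALSE, which is an optional extra (not part of the one-liner above) that keeps the result a matrix when only one row passes.

```r
# Toy matrix: 3 rows, 10 columns (illustrative values only).
m <- rbind(
  rep(5, 10),        # constant row, so sd is 0 and CV is 0
  c(rep(0, 9), 10),  # one large value, so CV is sqrt(10), about 3.16
  1:10               # CV is about 0.55
)

# Coefficient of variation for each row (MARGIN = 1 means "by row").
cv <- apply(m, 1, function(x) sd(x) / mean(x))

# Logical vector with one element per row: TRUE where the row passes.
keep <- cv > 2

# Index the ORIGINAL matrix with the logical vector.
# drop = FALSE keeps a 1-row result as a matrix instead of a plain vector.
filtered <- m[keep, , drop = FALSE]
print(filtered)
```

Only the second row is kept here, and what is printed is the row's original values, not its coefficient of variation. Computing cv and keep as separate steps does the same thing as the one-liner, just with the intermediate values visible for inspection.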

You can pass na.rm = TRUE to both sd and mean if you want NA values to be ignored when they are calculated.
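A minimal sketch of the NA-tolerant version follows. The helper name cv_row and the toy values are hypothetical, chosen only for this example. The extra !is.na(cv) guard is an assumption about what you would want: without it, a row whose coefficient of variation is NA would come back as a row of NAs.

```r
# Hypothetical helper: coefficient of variation that ignores NA values.
cv_row <- function(x, na.rm = TRUE) {
  sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

# Toy matrix with one NA in each row (illustrative values only).
m <- rbind(
  c(rep(0, 8), NA, 10),  # CV of the 9 non-missing values is 3
  c(1:9, NA)             # CV of the 9 non-missing values is about 0.55
)

cv <- apply(m, 1, cv_row)

# Guard against rows where the CV itself is NA (e.g. a row of all NAs).
keep <- !is.na(cv) & cv > 2

filtered <- m[keep, , drop = FALSE]
print(filtered)
```

The NA in a kept row is still present in the output, because na.rm only affects how sd and mean are computed and does not change the matrix.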

Upvotes: 1
