vitor

Reputation: 1250

Deleting rows of a data frame that meet a specific condition

I've seen similar questions here, but I couldn't find any help.

I have a df like this:

df <- data.frame(CSF1=c(-9,-9,-9,-9), CSF2=c(-9,-1,-9,-9), 
               D13S1=c(-9,-9,11,11), D13S2=c(-9,-9,11,12))

      CSF1 CSF2 D13S1 D13S2
10398   -9   -9    -9    -9
10398   -9   -1    -9    -9
20177   -9   -9    11    11
20361   -9   -9    11    12

I want to delete every row in which all columns are -9 or -1, like the first 2 rows.

Thanks!

Upvotes: 0

Views: 236

Answers (2)

CHP

Reputation: 17189

Try this (edited by Arun to account for Dov's post):

df[rowSums(df == -1 | df == -9, na.rm = TRUE) != ncol(df), ]
##   CSF1 CSF2 D13S1 D13S2
## 3   -9   -9    11    11
## 4   -9   -9    11    12

(df == -1 | df == -9) gives a logical matrix. rowSums counts the TRUE values in each row, since TRUE is evaluated as 1. na.rm = TRUE ensures that rows containing NA are not omitted (see Dov's post). The resulting logical vector is then used to subset df.
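The steps described above can be unpacked into separate assignments. This is only a sketch of the same one-liner; the intermediate names is_code, n_code, keep and result are illustrative and not part of the original answer.

```r
df <- data.frame(CSF1 = c(-9, -9, -9, -9), CSF2 = c(-9, -1, -9, -9),
                 D13S1 = c(-9, -9, 11, 11), D13S2 = c(-9, -9, 11, 12))

# Logical matrix: TRUE wherever a cell holds -1 or -9
is_code <- df == -1 | df == -9

# Count of TRUE cells per row (TRUE is summed as 1)
n_code <- rowSums(is_code, na.rm = TRUE)

# Logical vector: TRUE for rows where at least one cell is not -1 or -9
keep <- n_code != ncol(df)

# Subset the data frame with the logical vector
result <- df[keep, ]
```

Here n_code is 4 for the first two rows and 2 for the last two, so only rows 3 and 4 are kept.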

Upvotes: 2

Dov Chelst

Reputation: 46

All I will add is that the which function doesn't appear to be necessary. Removing it yields the same result.

There is a secondary problem in situations with missing data. If you add an NA to the 3rd row (try it with df[3,4] <- NA), then the original version of the solution above (without na.rm = TRUE) will omit the 3rd row as well, regardless of the other entries' values. I won't suggest alternatives, as this may not be a problem for your data set.
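To see the effect of the missing value, the following sketch compares the row test with and without na.rm = TRUE. The names is_code, without_na_rm and with_na_rm are illustrative, and the which() call is assumed to match the earlier form of the answer.

```r
df <- data.frame(CSF1 = c(-9, -9, -9, -9), CSF2 = c(-9, -1, -9, -9),
                 D13S1 = c(-9, -9, 11, 11), D13S2 = c(-9, -9, 11, 12))
df[3, 4] <- NA

is_code <- df == -1 | df == -9

# Without na.rm, the sum for row 3 is NA, so the comparison is NA as well
without_na_rm <- rowSums(is_code) != ncol(df)

# With na.rm = TRUE, the NA cell is skipped and row 3 compares as TRUE
with_na_rm <- rowSums(is_code, na.rm = TRUE) != ncol(df)

# which() drops the NA entry, so row 3 disappears from the result
lost_row <- df[which(without_na_rm), ]

# Row 3 is retained, with its NA intact
kept_row <- df[with_na_rm, ]
```

Note that with na.rm = TRUE an NA cell is treated as not being -1 or -9, so a row made up only of those codes and NA would also be kept. Whether that is desirable depends on the data.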

Upvotes: 3
