Extract rows considering a condition in a data frame in R

Question

I want to extract the gene IDs for which every value column is zero. The first column of df holds gene IDs, and each ID is repeated once per individual (second column). I want the gene IDs for which df[,3:length(df)] is all zero for every individual of that gene.

> dim(df)
[1] 1040675      56

> df
ID     INDV     tra1   tr2   tr3  tra2   tr15   tr1b  
ENS777   1       1.2    0     0   1.6    3.3    0
ENS777   2       1.2    0     0   1.6    3.3    0
ENS777   3       1.2    0     0   1.6    3.3    0
ENS777   4       1.2    0     0   1.6    3.3    0
ENS999   1        0     0     0    0      0     0
ENS999   2        0     0     0    0      0     0
ENS999   3        0     0     0    0      0     0
ENS999   4        0     0     0    0      0     0
ENS888   1       1.2    0     0   1.6    3.3    0
ENS888   2       1.2    0     0   1.6    3.3    0
ENS888   3       1.2    0     0   1.6    3.3    0
ENS888   4       1.2    0     0   1.6    3.3    0

So the output would be ENS999 in this case.

Ronak Shah · Accepted Answer

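Using dplyr:

library(dplyr)

df %>%
  group_by(ID) %>%
  # keep a group only if every value in tra1:tr1b is 0 for all of its rows
  # (cur_data() is deprecated as of dplyr 1.1.0; pick(tra1:tr1b) replaces it there)
  filter(all(unlist(dplyr::select(cur_data(), tra1:tr1b) == 0))) %>%
  ungroup() %>%
  # one row per surviving gene ID
  distinct(ID)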

#   ID
#1 ENS999
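
For comparison, here is a minimal base R sketch of the same idea that does not need dplyr. It rebuilds the sample data from the question (only the eight columns shown, not the full 56) and assumes the value columns are numeric with no NA values. The names row_all_zero, id_all_zero and zero_ids are illustrative only.

```r
# Sample data reconstructed from the question (only the columns shown there)
df <- data.frame(
  ID   = rep(c("ENS777", "ENS999", "ENS888"), each = 4),
  INDV = rep(1:4, times = 3),
  tra1 = rep(c(1.2, 0, 1.2), each = 4),
  tr2  = 0,
  tr3  = 0,
  tra2 = rep(c(1.6, 0, 1.6), each = 4),
  tr15 = rep(c(3.3, 0, 3.3), each = 4),
  tr1b = 0
)

# TRUE for rows where every value column (column 3 onward) is zero.
# Assumption: no NA values; an NA would make rowSums() return NA here.
row_all_zero <- rowSums(df[, 3:ncol(df)] != 0) == 0

# For each gene ID, check that all of its rows are all-zero rows
id_all_zero <- tapply(row_all_zero, df$ID, all)

# Keep the IDs that pass the check
zero_ids <- names(id_all_zero)[id_all_zero]
zero_ids
# [1] "ENS999"
```

Working on a logical matrix with rowSums() avoids per-group data frame subsetting, which may matter for a data frame with around a million rows, although that has not been measured here.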
