vds

Reputation: 359

Select rows with certain value in any of the columns

I have a data frame containing gene expression data.

I'm trying to extract all rows where ANY of the columns has a value >= 2 (the data is already log2-transformed), but I can't seem to get there. My data is:

      A B C D
Gene1 1 2 3 1
Gene2 2 1 1 4
Gene3 1 1 0 1
Gene4 1 2 0 1

I would like to retain only Gene1, Gene2 and Gene4 without naming every column (this is just a toy example).

Upvotes: 0

Views: 540

Answers (1)

akrun

Reputation: 886948

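You could use rowSums on the logical matrix derived from df >= 2, which counts the columns meeting the condition in each row, and then double negate (!!) to turn those counts into a logical index (0 becomes FALSE, any positive count becomes TRUE) for subsetting the rows.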

df[!!rowSums(df >=2),]
#      A B C D
#Gene1 1 2 3 1
#Gene2 2 1 1 4
#Gene4 1 2 0 1
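
To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps. The variable names (hits, n_hits, keep, result) are only illustrative, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Toy data from the question, rebuilt here so the snippet is self-contained
df <- data.frame(A = c(1L, 2L, 1L, 1L), B = c(2L, 1L, 1L, 2L),
                 C = c(3L, 1L, 0L, 0L), D = c(1L, 4L, 1L, 1L),
                 row.names = c("Gene1", "Gene2", "Gene3", "Gene4"))

hits   <- df >= 2         # logical matrix: TRUE where a value is >= 2
n_hits <- rowSums(hits)   # number of qualifying columns per gene
keep   <- !!n_hits        # 0 -> FALSE, any positive count -> TRUE
result <- df[keep, ]      # keeps Gene1, Gene2 and Gene4

print(n_hits)
print(result)
```

If the double negation reads as too terse, rowSums(df >= 2) > 0 should give the same logical index.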

Or use the reverse condition, df < 2, to get the logical matrix, apply rowSums, and keep the rows where the count is less than ncol(df), i.e. the rows where not every column is below 2.

df[rowSums(df <2) < ncol(df),]
#      A B C D
#Gene1 1 2 3 1
#Gene2 2 1 1 4
#Gene4 1 2 0 1

Or apply any to each row (here, to each column of the transposed logical matrix):

df[apply(t(df>=2),2, any), ]
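
As a rough cross-check, the sketch below computes the row index in three ways and compares them. It uses apply with MARGIN = 1 directly on the logical matrix, which should be equivalent to the transposed form above. The names keep_apply, keep_sums and keep_rev are illustrative only, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Toy data from the question, rebuilt here so the snippet is self-contained
df <- data.frame(A = c(1L, 2L, 1L, 1L), B = c(2L, 1L, 1L, 2L),
                 C = c(3L, 1L, 0L, 0L), D = c(1L, 4L, 1L, 1L),
                 row.names = c("Gene1", "Gene2", "Gene3", "Gene4"))

keep_apply <- apply(df >= 2, 1, any)        # MARGIN = 1: any TRUE within each row
keep_sums  <- rowSums(df >= 2) > 0          # at least one column is >= 2
keep_rev   <- rowSums(df < 2) < ncol(df)    # not every column is below 2

# All three indices should select the same rows
print(all(keep_apply == keep_sums) && all(keep_sums == keep_rev))
print(df[keep_apply, ])
```

On larger expression matrices the rowSums variants are usually preferable, since rowSums is vectorised while apply calls any once per row.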

data

df <- structure(list(A = c(1L, 2L, 1L, 1L), B = c(2L, 1L, 1L, 2L), 
 C = c(3L, 1L, 0L, 0L), D = c(1L, 4L, 1L, 1L)), .Names = c("A", 
"B", "C", "D"), class = "data.frame", row.names = c("Gene1", 
"Gene2", "Gene3", "Gene4"))

Upvotes: 1
