Reputation: 359
I have a data frame containing gene expression data.
I'm trying to extract all rows where ANY of the columns has a value >= 2 (the data is already log2-transformed), but I can't seem to get there. My data is:
A B C D
Gene1 1 2 3 1
Gene2 2 1 1 4
Gene3 1 1 0 1
Gene4 1 2 0 1
I would like to retain only Gene1, Gene2 and Gene4, without naming every column explicitly (this is just a toy example).
Upvotes: 0
Views: 540
Reputation: 886948
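You could use rowSums on the logical matrix derived from df >= 2 to count the matching values in each row, then double negate (!!) that count to turn it into a logical index (TRUE for any nonzero count) for subsetting the rows.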
df[!!rowSums(df >=2),]
# A B C D
#Gene1 1 2 3 1
#Gene2 2 1 1 4
#Gene4 1 2 0 1
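To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps. The names hits, n_hits, keep and filtered are only illustrative, and the data frame is rebuilt from the toy example in the question.

```r
# Toy data from the question
df <- data.frame(
  A = c(1L, 2L, 1L, 1L),
  B = c(2L, 1L, 1L, 2L),
  C = c(3L, 1L, 0L, 0L),
  D = c(1L, 4L, 1L, 1L),
  row.names = c("Gene1", "Gene2", "Gene3", "Gene4")
)

hits   <- df >= 2         # logical matrix: TRUE where a value is >= 2
n_hits <- rowSums(hits)   # number of TRUE values in each row
keep   <- n_hits > 0      # same logical index that !!n_hits produces
filtered <- df[keep, ]    # rows with at least one value >= 2
```

Writing n_hits > 0 instead of !!n_hits is a matter of taste; it states the intent a bit more explicitly.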
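Or use the reverse condition df < 2 to get the logical matrix, apply rowSums to count the values below the threshold in each row, and keep the rows where that count is less than ncol(df), i.e. where not every column is below 2.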
df[rowSums(df <2) < ncol(df),]
# A B C D
#Gene1 1 2 3 1
#Gene2 2 1 1 4
#Gene4 1 2 0 1
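As a rough check of the reverse-condition approach, the sketch below prints the per-row counts it relies on and compares the resulting index with the direct df >= 2 version. The variable names n_below, keep_reverse and keep_direct are assumptions made for illustration, and the comparison is only shown for this toy data without missing values.

```r
# Toy data from the question
df <- data.frame(
  A = c(1L, 2L, 1L, 1L),
  B = c(2L, 1L, 1L, 2L),
  C = c(3L, 1L, 0L, 0L),
  D = c(1L, 4L, 1L, 1L),
  row.names = c("Gene1", "Gene2", "Gene3", "Gene4")
)

n_below      <- rowSums(df < 2)        # values below the threshold per row
keep_reverse <- n_below < ncol(df)     # FALSE only if every column is below 2
keep_direct  <- rowSums(df >= 2) > 0   # at least one column is >= 2

print(n_below)
print(all(keep_reverse == keep_direct))
```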
Or, using apply with any on the transposed logical matrix:
df[apply(t(df>=2),2, any), ]
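The transpose in the apply version is not strictly needed: applying any over the row margin (MARGIN = 1) of the untransposed logical matrix should give the same index. A small sketch, with keep_t and keep_r as illustrative names and the toy data rebuilt from the question:

```r
# Toy data from the question
df <- data.frame(
  A = c(1L, 2L, 1L, 1L),
  B = c(2L, 1L, 1L, 2L),
  C = c(3L, 1L, 0L, 0L),
  D = c(1L, 4L, 1L, 1L),
  row.names = c("Gene1", "Gene2", "Gene3", "Gene4")
)

keep_t <- apply(t(df >= 2), 2, any)  # columns of the transposed matrix
keep_r <- apply(df >= 2, 1, any)     # rows of the original matrix

result <- df[keep_r, ]
```

For large expression matrices the rowSums variants are usually the faster choice, since apply loops over the rows in R.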
# Data used in the examples above
df <- structure(list(A = c(1L, 2L, 1L, 1L), B = c(2L, 1L, 1L, 2L),
C = c(3L, 1L, 0L, 0L), D = c(1L, 4L, 1L, 1L)), .Names = c("A",
"B", "C", "D"), class = "data.frame", row.names = c("Gene1",
"Gene2", "Gene3", "Gene4"))
Upvotes: 1