Haakonkas

Reputation: 1041

R: Extract column names conditionally based on column content

I have a data frame like this:

structure(list(id = 1:10, emrA = c("I219V, T286A", "1", "0", 
"I219V", "I219V", "R164H, I219V", "R164H, I219V", "R164H, I219V", 
"R164H, I219V", "R164H, I219V"), gyrA_8 = c("S83L,678E", "D87N", 
"S83L,252G", "S83L,678E", "S83L,678E", "S83L,828T", "S83L,828T", 
"S83L,828T", "S83L,828T", "S83L,828T"), emrY = c("0", "1", "1", 
"1", "1", "1", "1", "1", "1", "1"), T_CIP = c(0.25, 0.12, 0.12, 
0.25, 0.25, 0.5, 2, 1, 1, 2)), .Names = c("id", "emrA", "gyrA_8", 
"emrY", "T_CIP"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L))

Which looks like this:

     id emrA         gyrA_8    emrY  T_CIP
      1 I219V, T286A S83L,678E 0     0.25
      2 1            D87N      1     0.12
      3 0            S83L,252G 1     0.12
      4 I219V        S83L,678E 1     0.25
      5 I219V        S83L,678E 1     0.25
      6 R164H, I219V S83L,828T 1     0.5
      7 R164H, I219V S83L,828T 1     2
      8 R164H, I219V S83L,828T 1     1
      9 R164H, I219V S83L,828T 1     1
     10 R164H, I219V S83L,828T 1     2

I want to create a vector with the names of the columns that do NOT contain ONLY 1/0 values.

What I want to get:

vector: c("id", "emrA", "gyrA_8", "T_CIP")

I have tried to use something like this without luck:

lapply(df, function(x) colnames(which(grepl("regex_pattern", x) == TRUE)))

Upvotes: 2

Views: 495

Answers (2)

akrun

Reputation: 886938

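We can also use Filter, which keeps the columns for which the function returns TRUE. Wrap the call in names() to get the column names rather than the columns themselves:

names(Filter(function(x) !all(x %in% c(0, 1)), df))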
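
As a rough check, here is a minimal sketch that runs the Filter approach on the first three rows of the question's data. The object names df and kept are just for illustration. Note that emrY is stored as character ("0"/"1"); %in% coerces both sides to a common type before matching, which is why comparing against the numeric c(0, 1) still catches it.

```r
# First three rows of the question's data, rebuilt as a plain data frame
df <- data.frame(
  id     = 1:3,
  emrA   = c("I219V, T286A", "1", "0"),
  gyrA_8 = c("S83L,678E", "D87N", "S83L,252G"),
  emrY   = c("0", "1", "1"),
  T_CIP  = c(0.25, 0.12, 0.12),
  stringsAsFactors = FALSE
)

# Keep columns that are NOT made up solely of 0/1, then take their names
kept <- names(Filter(function(x) !all(x %in% c(0, 1)), df))
print(kept)
```

On this subset, emrY is the only column dropped; emrA survives because it mixes "1" and "0" with other values.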

Upvotes: 0

PKumar

Reputation: 11128

If df is your data frame, you can do this:

output <- df[,!sapply(df, function(x) all(x %in% c(0,1)))]

This keeps only the columns that do not consist solely of 0s and 1s.

If you want the column names, call names(output) or colnames(output).
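
If only the names are needed, a possible variation is to index names(df) directly instead of building the intermediate data frame. This is a sketch, not part of the original answer: is_binary and keep are illustrative names, and vapply is swapped in for sapply so the result is guaranteed to be a logical vector. It also sidesteps a base R quirk: on a plain data.frame (not a tibble), df[, idx] returns a bare vector when exactly one column is selected, so names(output) would not give the column name.

```r
# Small example data in the same shape as the question's data frame
df <- data.frame(
  id    = 1:3,
  emrA  = c("I219V, T286A", "1", "0"),
  emrY  = c("0", "1", "1"),
  T_CIP = c(0.25, 0.12, 0.12),
  stringsAsFactors = FALSE
)

# TRUE for columns whose values are all 0 or 1 (numeric or character)
is_binary <- vapply(df, function(x) all(x %in% c(0, 1)), logical(1))

# Names of the columns that are not purely 0/1
keep <- names(df)[!is_binary]
print(keep)
```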

Upvotes: 1
