dpel

Reputation: 2123

Remove rows containing string in any vector in data frame

I have a data frame with several columns that contain strings. I would like to remove every row that contains a certain string in any of those columns.

df <- data.frame(id = 1:10,
                 foo = runif(10),
                 sapply(letters[1:5], function(x) sample(letters, 10, TRUE)),
                 bar = runif(10))

This can be done on a single column by specifying the column name, e.g.

df <- df[!grepl("b", df$a),]

which I can then repeat for each of the other columns, e.g.

df <- df[!grepl("b", df$b),]
df <- df[!grepl("b", df$c),]
df <- df[!grepl("b", df$d),]
df <- df[!grepl("b", df$e),]

but is it possible to do it in one line without having to specify which columns contain the string? Something like:

df <- df[!grepl("b", df),]

Upvotes: 4

Views: 3188

Answers (3)

zx8754

Reputation: 56054

Paste columns then grepl:

df[!grepl("b", paste0(df$a, df$b, df$c, df$d, df$e)), ]

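Identify the factor or character columns (data.frame() creates character columns by default since R 4.0.0), then paste each row:

df[!grepl("b",
          apply(df[, sapply(df, function(x) is.factor(x) || is.character(x)), drop = FALSE],
                1, paste0, collapse = ",")), ]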
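
One caveat with the paste approach, shown as a minimal sketch: when columns are pasted without a separator, a multi-character pattern can match across the boundary between two cells. The toy data frame and the pattern "ab" below are made up for illustration, and fixed = TRUE is my assumption that the pattern should be treated as a literal string.

```r
# Toy data (hypothetical): no single cell contains "ab"
toy <- data.frame(a = c("xa", "cd"),
                  b = c("bz", "ef"),
                  stringsAsFactors = FALSE)

# Without a separator, row 1 pastes to "xabz", so "ab" matches spuriously
no_sep <- grepl("ab", paste0(toy$a, toy$b), fixed = TRUE)

# With a separator that cannot occur in the pattern, cells stay distinct
with_sep <- grepl("ab", paste(toy$a, toy$b, sep = ","), fixed = TRUE)
```

For the single-letter pattern in the question this makes no difference, but a separator is the safer default if the pattern may be longer.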

Upvotes: 4

RHertel

Reputation: 23788

You could try

df[-which(df=="b", arr.ind=TRUE)[,1],]

or, as suggested by @docendodiscimus

df[rowSums(df == "b") == 0,]

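The second option is preferable: if no cell equals "b", which() returns an empty index, and negative indexing with an empty vector drops every row, whereas the rowSums() version leaves the data frame unchanged. Note that both options compare whole cells for equality rather than matching a pattern inside them.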
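
A small sketch of the no-match case on made-up data may help show why the two forms differ; the toy data frame here is hypothetical and deliberately contains no "b".

```r
# Toy data (hypothetical): no cell equals "b"
toy <- data.frame(id = 1:3,
                  a = c("x", "y", "z"),
                  stringsAsFactors = FALSE)

# which() finds nothing, so the vector of row indices is empty
hits <- which(toy == "b", arr.ind = TRUE)[, 1]

# Negative indexing with an empty vector selects zero rows
dropped_all <- toy[-hits, ]

# The rowSums() form keeps every row when nothing matches
kept_all <- toy[rowSums(toy == "b") == 0, ]
```

Here dropped_all ends up with no rows at all, while kept_all still has all three.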

Upvotes: 6

inscaven

Reputation: 2584

target_cols <- c("a", "b", "c", "d", "e")
df[!Reduce(`|`, lapply(df[,target_cols], function(col) grepl("b", col))),]
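
If you would rather not list the target columns by hand, the same Reduce() idea could be wrapped in a helper that picks out the character and factor columns itself. This is only a sketch: the name drop_rows_matching and the toy data are hypothetical, not part of any package.

```r
# Hypothetical helper: drop rows where any character or factor column
# matches `pattern` (interpreted by grepl() as a regular expression)
drop_rows_matching <- function(data, pattern) {
  # Flag the columns that hold text
  is_text <- vapply(data,
                    function(col) is.character(col) || is.factor(col),
                    logical(1))
  if (!any(is_text)) return(data)

  # One logical vector per text column, then OR them together row-wise
  hits <- lapply(data[is_text], function(col) grepl(pattern, col))
  data[!Reduce(`|`, hits), , drop = FALSE]
}

# Toy data (hypothetical)
toy <- data.frame(id = 1:3,
                  a = c("x", "b", "z"),
                  c = c("q", "r", "ab"),
                  stringsAsFactors = FALSE)

res <- drop_rows_matching(toy, "b")
```

Because grepl() is only applied to text columns, numeric columns such as id are never coerced to strings and cannot trigger a match by accident.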

Upvotes: 1
