Reputation: 14774
Let's say I want to write a function like:
Fn <- function(df, to_remove = NULL) {
df <- df[!df %in% to_remove,]
}
The goal is to remove every row (matched by value, not by row number, index, or name) in which at least one value equals one of the value(s) given in to_remove.
Any idea why this doesn't work without specifying a column?
Example:
df <- data.frame(a = c("a", "a", "a"), b = c("a", "b", "a"))
a b
1 a a
2 a b
3 a a
Expected output:
a b
1 a a
3 a a
I'm looking for a base R or data.table solution.
Upvotes: 0
Views: 82
Reputation: 11150
To remove rows, you need to supply either negative row indices or a logical vector (typically of length nrow(df)) of TRUE and FALSE values. Your code, !df %in% to_remove, does not do that: applied to a data frame, %in% yields one value per column rather than one per row. Try this:
Fn <- function(df, to_remove = NULL) {
df[!apply(df, 1, function(x) any(x %in% to_remove)), ]
}
Fn(df, "b")
a b
1 a a
3 a a
Fn(df, c("a", "b"))
[1] a b
<0 rows> (or 0-length row.names)
Fn(df, "d")
a b
1 a a
2 a b
3 a a
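To see why the original attempt fails, here is a small check on the example data from the question. It is only a sketch: the names col_mask and row_mask are made up for illustration. It compares the length of what %in% returns for the whole data frame with the length of the per-row mask built by apply().

```r
df <- data.frame(a = c("a", "a", "a"), b = c("a", "b", "a"))

# %in% treats the data frame as a list of columns, so the result
# has one element per column, not one per row.
col_mask <- df %in% "b"
length(col_mask) == ncol(df)

# apply() over rows (MARGIN = 1) gives one TRUE/FALSE per row,
# which is what row subsetting needs.
row_mask <- apply(df, 1, function(x) any(x %in% "b"))
length(row_mask) == nrow(df)

# Keep only the rows that contain no match.
df[!row_mask, ]
```

Note that apply() converts the data frame to a matrix first, so all columns are compared as a single common type (character here).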
Upvotes: 1
Reputation: 4999
Why not a simple loop?
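rowrem <- function(x, val) {
  # Mark rows instead of deleting inside the loop: deleting would shift
  # the row indices and cause later rows to be skipped.
  keep <- rep(TRUE, nrow(x))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
      # %in% (rather than ==) also works when val holds several values
      if (paste(x[i, j]) %in% val) {
        keep[i] <- FALSE
      }
    }
  }
  x[keep, , drop = FALSE]
}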
Result
> rowrem(df1, "b")
a b
1 a a
3 a a
Explanation: You want to check the value of every cell and map each match back to its row number. With base R your choices are a bit limited in that regard. A sensible (i.e., maintainable) solution would probably look something like the above, but I'm sure someone will come up with an lapply or subsetting solution as well.
df1 <- data.frame(a = c("a", "a", "a"), b = c("a", "b", "a"))
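For comparison, the lapply-style alternative mentioned above might look roughly like the following. This is a minimal base R sketch, not a definitive implementation: the function name remove_rows_with is hypothetical, and it assumes the columns can be compared to to_remove directly with %in%.

```r
# Hypothetical helper, named here only for illustration.
remove_rows_with <- function(df, to_remove = NULL) {
  # Nothing to match against, or no columns to inspect: return unchanged.
  if (length(to_remove) == 0L || ncol(df) == 0L) return(df)
  # One logical vector per column: TRUE where the cell matches to_remove.
  hits <- lapply(df, function(col) col %in% to_remove)
  # Combine the columns with OR to get one TRUE/FALSE per row.
  drop_row <- Reduce(`|`, hits)
  df[!drop_row, , drop = FALSE]
}

df1 <- data.frame(a = c("a", "a", "a"), b = c("a", "b", "a"))
remove_rows_with(df1, "b")
```

Because each column is tested on its own, this avoids the matrix conversion that apply() performs, which may matter for data frames with mixed column types.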
Upvotes: 1