Reputation: 1973
This should be a simple issue but I am struggling.
I have a vector of variable names that I want to exclude from a data frame:
df <- data.frame(matrix(rexp(50), nrow = 10, ncol = 5))
names(df) <- paste0(rep("variable_", 5), 1:5)
excluded_vars <- c("variable_1", "variable_3")
I would have thought that just excluding the object in the select statement with - would have worked:
select(df, -excluded_vars)
But I get the following error:
Error in -excluded_vars : invalid argument to unary operator
The same is true when using select_().
Any ideas?
Upvotes: 32
Views: 80770
Reputation: 1044
You are almost there: just wrap excluded_vars in -c().
Like this:
select(df, -c(excluded_vars))
Upvotes: 9
Reputation: 11
How about this? You need to build the vector of column names beforehand and then rename the columns so they line up with their original order, but it might work.
cc1 <- c("id")
nm <- names(df)
cc2 <- setdiff(nm, cc1)
select(df, .cols=c(everything(), -cc1)) %>% rename_with(~ cc2)
Upvotes: 1
Reputation: 21
The select(... -one_of()) method was giving me an error (unused argument (-one_of(excluded_vars))).
df[, -which(names(df) %in% excluded_vars)]
worked for me instead (R 4.0.3).
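One caveat with the -which() form, sketched below in base R only: if none of the names in the exclusion vector match, which() returns integer(0), and negating that selects zero columns rather than all of them. Logical indexing avoids this. The name "not_a_column" is a made-up placeholder for illustration.

```r
df <- data.frame(matrix(rexp(50), nrow = 10, ncol = 5))
names(df) <- paste0("variable_", 1:5)
excluded_vars <- c("variable_1", "variable_3")

# Logical indexing: keep every column whose name is not in excluded_vars
kept <- df[, !(names(df) %in% excluded_vars), drop = FALSE]

# Hypothetical exclusion vector with no matching column in df
no_match <- c("not_a_column")

# -which() drops everything when nothing matches ...
n_which <- ncol(df[, -which(names(df) %in% no_match), drop = FALSE])

# ... while logical negation keeps all columns, as intended
n_logical <- ncol(df[, !(names(df) %in% no_match), drop = FALSE])
```

drop = FALSE is included so the result stays a data frame even when only one column is left.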
Upvotes: 0
Reputation: 6220
select(df, -any_of(excluded_vars))
is now the safest way to do this: the code will not break if excluded_vars contains a variable name that doesn't exist in df.
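A minimal sketch of that behaviour, assuming dplyr 1.0.0 or later (where any_of() and all_of() are available). "variable_9" is a made-up name that is deliberately absent from df.

```r
library(dplyr)

df <- data.frame(matrix(rexp(50), nrow = 10, ncol = 5))
names(df) <- paste0("variable_", 1:5)

# "variable_9" is a hypothetical name that does not exist in df
excluded_vars <- c("variable_1", "variable_3", "variable_9")

# any_of() silently ignores names that are missing from df
kept <- select(df, -any_of(excluded_vars))

# all_of() is the strict counterpart: it raises an error for missing names
strict <- tryCatch(
  select(df, -all_of(excluded_vars)),
  error = function(e) "error"
)
```

Here kept contains variable_2, variable_4 and variable_5, while the all_of() call fails because of the missing name. Which one to use depends on whether a misspelled or absent column should be an error in your pipeline.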
Upvotes: 15
Reputation: 44
Simply use the negation operator:
select(df, !c(col1, col2, col3))
Upvotes: -1
Reputation: 9
You can write:
df %>% dplyr::select(colname)
Some other packages also export a select function, which can mask dplyr's and may be the cause of the problem. That is why you need to qualify the call with the package name.
Upvotes: 0
Reputation: 1973
As of a more recent version of dplyr, the following now works:
select(df, -excluded_vars)
Upvotes: 13
Reputation: 5191
You need to use the one_of function:
select(df, -one_of(excluded_vars))
See the section on Useful Functions in the dplyr documentation for select for more about selecting based on variable names.
Upvotes: 32
Reputation: 5893
With select_, you could simply use setdiff.
select_(df, .dots = setdiff(colnames(df), excluded_vars))
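Note that the underscore verbs such as select_() have been deprecated in later dplyr releases. The same setdiff idea can be sketched with plain select(), assuming dplyr 1.0.0 or later for all_of(). keep_vars is just an illustrative variable name.

```r
library(dplyr)

df <- data.frame(matrix(rexp(50), nrow = 10, ncol = 5))
names(df) <- paste0("variable_", 1:5)
excluded_vars <- c("variable_1", "variable_3")

# Compute the columns to keep, then select them by name
keep_vars <- setdiff(colnames(df), excluded_vars)
kept <- select(df, all_of(keep_vars))
```

Because setdiff() only returns names that are actually in df, all_of() cannot fail here on a missing column.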
Upvotes: 1