Reputation: 1031
Some columns in my data frame have names that include the suffix _rc_1. I want to remove these columns from the data frame. I've tried several options and get errors that show I'm misunderstanding something. For example:
library(dplyr)
newdata <- subset(mydata, -contains("_rc_1"))
Error: No tidyselect variables were registered
I'm agnostic to how I solve the problem. Perhaps this is done best with grepl() and a regular expression, but I'm struggling to implement a version that performs as planned here as well.
Upvotes: 1
Views: 557
Reputation: 28695
In base R you can use grepl to get a logical vector with length equal to ncol(mydata) which is TRUE for column names ending in _rc_1 (the $ ensures that _rc_1 comes at the end). Then, after swapping the TRUEs and FALSEs with !, you can subset your data frame using [].
newdata <- mydata[!grepl('_rc_1$', names(mydata))]
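To see what the intermediate logical vector looks like, here is a minimal sketch on a toy data frame. The column names (a, a_rc_1, b_rc_1_x) are made up for illustration and are not from the question's data.

```r
# Toy data frame; the column names are hypothetical.
mydata <- data.frame(a = 1:3, a_rc_1 = 4:6, b_rc_1_x = 7:9)

# TRUE only where the column name ends in "_rc_1"
drop <- grepl("_rc_1$", names(mydata))
drop
# FALSE  TRUE FALSE

# Keep the columns where drop is FALSE
newdata <- mydata[!drop]
names(newdata)
# "a"        "b_rc_1_x"
```

Note that b_rc_1_x is kept, because the $ anchor only matches names where _rc_1 is the final part.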
Upvotes: 1
Reputation: 887223
contains works with dplyr's select, not with subset. If we need to use subset (a base R function), use grep, which takes a regex pattern and returns either the numeric indices or the column names themselves. The select argument of subset accepts both as valid inputs.
subset(mydata, select = grep("_rc_1", names(mydata), value = TRUE, invert = TRUE))
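As a rough illustration of the "index or names" point, the sketch below runs grep both ways on a toy data frame (the column names are invented for the example) and passes each result to select.

```r
# Toy data frame; the column names are hypothetical.
mydata <- data.frame(a = 1:3, a_rc_1 = 4:6, b = 7:9)

# Numeric positions of the columns to keep (invert = TRUE flips the match)
keep_idx <- grep("_rc_1", names(mydata), invert = TRUE)

# The same columns, returned as names instead of positions
keep_names <- grep("_rc_1", names(mydata), invert = TRUE, value = TRUE)

# Both forms are accepted by the select argument of subset
by_idx <- subset(mydata, select = keep_idx)
by_names <- subset(mydata, select = keep_names)
names(by_idx)
names(by_names)
```

Using invert = TRUE is arguably safer than negating the positions with -grep(...): when nothing matches, grep returns integer(0), and a negated empty index selects no columns at all instead of keeping every column.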
Also, there are startsWith/endsWith in base R for prefix/suffix matches:
subset(mydata, select = names(mydata)[!endsWith(names(mydata), "_rc_1")])
In dplyr, contains is one of the select helpers and works with select:
library(dplyr)
mydata %>%
select(-contains("_rc_1"))
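Since the question is about a suffix, it may be worth comparing contains with the ends_with helper, which matches only at the end of the name. A minimal sketch on a toy data frame (the column names are made up for illustration):

```r
library(dplyr)

# Toy data frame; the column names are hypothetical.
mydata <- data.frame(a = 1:3, a_rc_1 = 4:6, b_rc_1_x = 7:9)

# contains() matches the string anywhere in the column name
kept_contains <- mydata %>% select(-contains("_rc_1")) %>% names()
kept_contains
# "a"

# ends_with() matches the string only at the end of the column name
kept_ends <- mydata %>% select(-ends_with("_rc_1")) %>% names()
kept_ends
# "a"        "b_rc_1_x"
```

If no column name has _rc_1 in the middle, the two helpers give the same result.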
A reproducible example with the built-in iris dataset:
data(iris)
head(subset(iris, select = names(iris)[!endsWith(names(iris), "Length")]))
iris %>%
select(-contains('Sepal')) %>%
head
Upvotes: 1