J.Q

Reputation: 1031

Removing vectors from a data frame by suffix in R

Some vectors (columns) in my data frame have names that end with the suffix _rc_1. I want to remove these vectors from the data frame. I've tried several options and get errors that show I'm misunderstanding something. For example:

library(dplyr)
newdata <- subset(mydata, -contains("_rc_1"))
Error: No tidyselect variables were registered

I'm agnostic to how I solve the problem.

Perhaps this is best done with grepl() and a regular expression, but I'm struggling to get that approach to work as planned too.

Upvotes: 1

Views: 557

Answers (2)

IceCreamToucan

Reputation: 28695

In base R you can use grepl to get a logical vector with length equal to ncol(mydata) which is TRUE for column names ending in _rc_1 (the $ ensures that _rc_1 comes at the end). Then after swapping the TRUEs and FALSEs with !, you can subset your data frame using [].

newdata <- mydata[!grepl('_rc_1$', names(mydata))]
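To see why the $ anchor matters, here is a minimal sketch on a made-up data frame. The column names (id, score_rc_1, score_rc_10, age) are assumptions chosen for illustration and do not come from the question.

```r
# Hypothetical toy data; the column names are invented for this example.
mydata <- data.frame(
  id          = 1:3,
  score_rc_1  = c(2, 4, 6),
  score_rc_10 = c(1, 3, 5),
  age         = c(30, 41, 27)
)

# Anchored pattern: TRUE only for names that end in "_rc_1".
is_suffix <- grepl("_rc_1$", names(mydata))

# Keep the columns that do NOT match. drop = FALSE guarantees a data frame
# is returned even if only one column survives.
newdata <- mydata[, !is_suffix, drop = FALSE]
names(newdata)

# Without the anchor the pattern matches anywhere in the name,
# so "score_rc_10" would be flagged as well.
unanchored_hits <- names(mydata)[grepl("_rc_1", names(mydata))]
unanchored_hits
```

With the anchored pattern only score_rc_1 is dropped. The unanchored pattern would also catch score_rc_10, which is probably not what you want when matching a suffix.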

Upvotes: 1

akrun

Reputation: 887223

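contains() is a dplyr selection helper, so it works inside dplyr verbs such as select() but not inside subset(). If we need to use subset (a base R function), use grep, which takes a regex pattern and returns either a numeric index or, with value = TRUE, the column names themselves. The select argument of subset accepts both as valid inputs.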

subset(mydata, select = grep("_rc_1$", names(mydata), value = TRUE, invert = TRUE))

Base R also has startsWith() and endsWith() for prefix and suffix matches:

subset(mydata, select = names(mydata)[!endsWith(names(mydata), "_rc_1")])

In dplyr, the selection helper contains() works with select(). Note that contains() matches the string anywhere in the column name, not only at the end:

library(dplyr)
mydata %>%
   select(-contains("_rc_1"))
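If only a true suffix should match, dplyr's ends_with() helper is a closer fit than contains(). The sketch below compares the two on a made-up data frame. The column names are assumptions for illustration, and it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data; the column names are invented for this example.
mydata <- data.frame(
  id          = 1:3,
  score_rc_1  = c(2, 4, 6),
  score_rc_10 = c(1, 3, 5)
)

# ends_with() drops only columns whose names end in "_rc_1".
suffix_only <- mydata %>%
  select(-ends_with("_rc_1"))

# contains() drops every column with "_rc_1" anywhere in its name,
# which here also removes "score_rc_10".
anywhere <- mydata %>%
  select(-contains("_rc_1"))

names(suffix_only)
names(anywhere)
```

Both helpers match literal strings rather than regular expressions, so no $ anchor is needed here.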

A reproducible example with the built-in dataset 'iris':

data(iris)
head(subset(iris, select = names(iris)[!endsWith(names(iris), "Length")]))
iris %>%
   select(-contains('Sepal')) %>%
   head()

Upvotes: 1
