Reputation: 595
My goal is to select several columns from mydata that contain certain patterns.
mydata <- data.frame(q1 = rnorm(10), q10 = rnorm(10), q12 = rnorm(10), q20 = rnorm(10))
Method 1 - using grep - does what I need in a parsimonious way:
myvars <- names(mydata)[grep("^q10|^q12", names(mydata))]
temp <- mydata[myvars]
tbl_df(temp)
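For reference, the grep step above can be shortened a little in base R. This is a minimal sketch, assuming the same column names as mydata; the seed is arbitrary and is there only to make the example reproducible. With value = TRUE, grep() returns the matching names directly, so there is no need to index names(mydata) a second time.

```r
# Arbitrary seed, only so the example is reproducible.
set.seed(1)
mydata <- data.frame(q1 = rnorm(10), q10 = rnorm(10),
                     q12 = rnorm(10), q20 = rnorm(10))

# value = TRUE makes grep() return the matching names, not their positions.
myvars <- grep("^q10|^q12", names(mydata), value = TRUE)
temp <- mydata[myvars]

names(temp)
# "q10" "q12"
```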
I am trying to do it purely in dplyr. However, I cannot find anything more parsimonious (like the grep approach) than:
temp <- cbind(select(mydata, starts_with("q10")), select(mydata, starts_with("q12")))
tbl_df(temp)
It's too much code. How could I make it work with an "|"? I tried the following but none of them work:
select(mydata, starts_with("q10|q12"))
select(mydata, starts_with(c("q10","q12")))
temp <- select(mydata, starts_with("q10","q12"))
select(mydata, starts_with(c("q10"))|starts_with(c("q12")))
Advice? Thank you!
Upvotes: 2
Views: 1007
Reputation: 99331
From the select() help file, I gather that the only special internal function that accepts a regular expression is matches(). You can use the regular expression ^q1(0|2) to start at the beginning of the name and match q1 with 0 or 2 following.
select(mydata, matches("^q1(0|2)"))
#            q10        q12
# 1  -0.97766671  1.2691732
# 2  -1.17397582 -0.8175758
# 3  -1.98684643  0.1117284
# 4   1.12142980  0.5737528
# 5   0.41680505  0.8974448
# 6   1.47558382 -1.5122752
# 7   0.39651297 -0.5282083
# 8  -0.13266148  0.8281671
# 9  -0.66982395  0.1239249
# 10  0.06119857 -0.3484675
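As a rough sketch of how the alternatives compare: select() takes the union of several helpers passed as separate arguments, so the cbind() in the question should not be needed. The last form below assumes tidyselect 1.0.0 or later, where starts_with() accepts a character vector; on older versions it is expected to fail, as the question reports. The seed and the object names by_args, by_regex and by_vector are illustrative only.

```r
library(dplyr)

# Arbitrary seed, only so the example is reproducible.
set.seed(42)
mydata <- data.frame(q1 = rnorm(10), q10 = rnorm(10),
                     q12 = rnorm(10), q20 = rnorm(10))

# Several helpers as separate arguments: select() keeps their union.
by_args <- select(mydata, starts_with("q10"), starts_with("q12"))

# A single regular expression via matches(), as shown above.
by_regex <- select(mydata, matches("^q1(0|2)"))

# Assumption: tidyselect >= 1.0.0, where starts_with() takes a character vector.
by_vector <- select(mydata, starts_with(c("q10", "q12")))

names(by_args)
# "q10" "q12"
```

Note that starts_with() matches a literal prefix, so starts_with("q1") would also pick up q1 itself; the regular expression in matches() is what lets you exclude it.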
Upvotes: 5