Reputation: 7107
I have some data which looks like:
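library(dplyr)
library(stringr)
data(iris)
# Move Species first, call it Y, and rename the remaining columns X1..XN
d <- iris %>%
  select(Species, everything()) %>%
  rename(Y = 1) %>%
  rename_at(vars(-c(1)), ~str_c("X", seq_along(.)))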
Data:
       Y  X1  X2  X3  X4
1 setosa 5.1 3.5 1.4 0.2
2 setosa 4.9 3.0 1.4 0.2
3 setosa 4.7 3.2 1.3 0.2
4 setosa 4.6 3.1 1.5 0.2
5 setosa 5.0 3.6 1.4 0.2
6 setosa 5.4 3.9 1.7 0.4
I add a random variable:
d$noise <- rnorm(nrow(d))
I am trying to extract just the Y, X1, X2, ..., XN variables (dynamically). What I currently have is:
d %>%
select("Y", cat(paste0("X", seq_along(2:ncol(.)), collapse = ", ")))
This doesn't work because it takes the noise column into account, and it doesn't work even without the noise column.
So I am trying to create a new data frame that contains just the Y, X1, X2, ..., XN columns.
Upvotes: 1
Views: 45
Reputation: 1268
dplyr provides two select helper functions that you could use: contains() for literal strings, or matches() for regular expressions.
In this case you could do
d %>%
select("Y", contains("X"))
or
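d %>%
  select("Y", matches("^X\\d+$"))
The first one works in the example you provided, but it would also select any other variable whose name contains an "X" (contains() ignores case by default, so a lowercase "x" counts too). The second is more robust: with the ^ and $ anchors it only captures variables whose names are exactly "X" followed by one or more digits (pass ignore.case = FALSE if a lowercase "x1" should not match).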
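For what it's worth, here is a small sketch of how anchoring the regular expression changes what matches() picks up. The data frame `toy` and its columns (including the distractor `maX1mum`) are made up for illustration; the only things assumed are dplyr's select() and matches().

```r
library(dplyr)

# Hypothetical toy data: Y, two X columns, and two distractor columns.
toy <- data.frame(
  Y       = c("a", "b", "c"),
  X1      = c(1, 2, 3),
  X2      = c(4, 5, 6),
  noise   = c(0.1, 0.2, 0.3),
  maX1mum = c(7, 8, 9)
)

# Unanchored: "X\\d+" may match anywhere in the name, so maX1mum is kept too.
loose <- toy %>% select("Y", matches("X\\d+"))

# Anchored: the whole name must be "X" followed by digits.
# ignore.case = FALSE stops a lowercase "x1" from matching as well.
strict <- toy %>% select("Y", matches("^X\\d+$", ignore.case = FALSE))

names(loose)
names(strict)
```

As for the attempt in the question: cat() prints its argument to the console and returns NULL, so select() never receives the column names, and paste0() with collapse produces a single string rather than a vector of names.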
Upvotes: 3