Reputation: 3020
I would like to use dplyr to select the columns whose names match the entries of a character vector.
one <- 1:10
two <- rnorm(10)
three <- runif(10, 1, 2)
four <- -10:-1
df <- data.frame(one, two, three, four)
vars <- c('on', 'thr')
I want to select only the columns in df whose names start with 'on' or 'thr':
dplyr::select_(df, starts_with(vars))
However, the above is not working.
Upvotes: 4
Views: 16529
Reputation: 546
Here is a solution using starts_with:
df %>%
  select(map(c('on', 'thr'),
             starts_with,
             vars = colnames(.)) %>%
           unlist())
Basically, the idea is to apply the starts_with function to each prefix by using map. To get it to work, one must pass the argument vars (the vector of column names), and then unlist the result of map to get a single vector of column positions.
This solution extends the one by Chrisss to the case where at least one entry matches several columns.
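To make the "list of positions, then unlist" step concrete, here is a minimal base R sketch of the same idea. It is not the code from this answer: it swaps starts_with and map for base R's startsWith and lapply, and the names prefixes, pos_list and pos are made up for illustration.

```r
# Example data frame with the same column names as in the question
df <- data.frame(one = 1:10, two = rnorm(10),
                 three = runif(10, 1, 2), four = -10:-1)

# Hypothetical name for the vector of prefixes to match
prefixes <- c("on", "thr")

# For each prefix, find the positions of the columns whose names start with it.
# This yields a list with one integer vector per prefix.
pos_list <- lapply(prefixes, function(p) which(startsWith(names(df), p)))

# Flatten the list into one vector of positions and drop any duplicates
pos <- unique(unlist(pos_list))

# Select the matching columns by position
result <- df[, pos, drop = FALSE]
names(result)
```

Because each prefix contributes its own vector of positions, a prefix that matches several columns is handled without any special casing.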
Upvotes: 1
Reputation: 1731
Presumably you know in advance which column-name prefixes you want, since you are hard-coding them, so you could use:
select(df, starts_with("on"), starts_with("thr"))
Ah, I see Tony Ladson essentially suggested this already. Depending on your exact use case, though, I don't see a need to get them from a vector.
Upvotes: 4
Reputation: 57686
The various selection helper functions in dplyr are meant to take only a single character string for matching. You can get around this by combining your strings into one regular expression and using matches:
vars <- paste0("^(", paste(vars, collapse="|"), ")")
select(df, matches(vars))
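For reference, here is a self-contained sketch of the regular-expression approach that also shows the pattern being built. It assumes dplyr is installed; the names prefixes, pattern and selected are illustrative and keep the original vector intact instead of overwriting vars.

```r
library(dplyr)

# Example data frame with the same column names as in the question
df <- data.frame(one = 1:10, two = rnorm(10),
                 three = runif(10, 1, 2), four = -10:-1)

# Hypothetical name for the vector of prefixes to match
prefixes <- c("on", "thr")

# Join the prefixes with "|" and anchor the group at the start of the name,
# giving the single pattern "^(on|thr)"
pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")

# matches() selects every column whose name matches the pattern
selected <- select(df, matches(pattern))
names(selected)
```

Note that the prefixes are pasted into the pattern verbatim, so a prefix containing regex metacharacters such as "." or "(" would need escaping first. Also, matches() ignores case by default, so pass ignore.case = FALSE if the match must be case-sensitive.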
Upvotes: 6