matsuo_basho

Reputation: 3020

Using starts_with in dplyr with a vector of partial column names

I would like to use dplyr to select the columns whose names match the entries of a character vector.

one <- 1:10
two <- rnorm(10)
three <- runif(10, 1, 2)
four <- -10:-1

df <- data.frame(one, two, three, four)

vars <- c('on', 'thr')

I want to select only the columns in df whose names start with 'on' or 'thr':

dplyr::select_(df, starts_with(vars))

However, the above does not work.

Upvotes: 4

Views: 16529

Answers (3)

Quentin Perrier

Reputation: 546

Here is a solution using starts_with:

df %>% 
  select(map(c('on', 'thr'), 
             starts_with, 
             vars = colnames(.)) %>% 
         unlist())

Basically, the idea is to apply the starts_with function to each element of the vector of prefixes by using map (from purrr). To get this to work, you must pass the vars argument (the vector of column names) and then unlist the result of map to get a vector of positions.

This solution extends the one by Chrisss to the case where at least one entry has several matches.
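To see what the map-then-unlist step produces, here is a minimal base R sketch of the same idea. It swaps in lapply and startsWith for map and starts_with, so it is an analogue rather than the code above; the names cols, prefixes and positions are made up for illustration.

```r
# Column names from the question's data frame
cols <- c("one", "two", "three", "four")

# Prefixes to look for (called `vars` in the question)
prefixes <- c("on", "thr")

# For each prefix, find the positions of the columns that start with it,
# then flatten the per-prefix results into a single integer vector
positions <- unlist(lapply(prefixes, function(p) which(startsWith(cols, p))))

positions        # 1 3
cols[positions]  # "one" "three"
```

A prefix that matched several columns would contribute several positions, which is why the result has to be flattened before it is handed to select.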

Upvotes: 1

doctorG

Reputation: 1731

Presumably you know in advance, because you're coding it in, what column name matches you want, so you could use

select(df, starts_with("on"), starts_with("thr"))

Ah, I see Tony Ladson essentially suggested this already. Depending on your exact use case, though, I don't see a need to get them from a vector.

Upvotes: 4

Hong Ooi

Reputation: 57686

The various selection helper functions in dplyr were designed to take only a single character string for matching (tidyselect 1.0.0 and later also accept a character vector). You can get around this by combining your strings into one regular expression and using matches:

vars <- paste0("^(", paste(vars, collapse="|"), ")")
select(df, matches(vars))
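To make the pattern concrete, here is a minimal sketch that rebuilds the question's data frame and checks which columns the combined regular expression picks up. The names pattern and selected are introduced here for illustration (the snippet above overwrites vars instead), and the sketch assumes the prefixes contain no regex metacharacters.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(
  one   = 1:10,
  two   = rnorm(10),
  three = runif(10, 1, 2),
  four  = -10:-1
)

vars <- c("on", "thr")

# Anchor at the start of the name (^) and join the prefixes as alternatives
pattern <- paste0("^(", paste(vars, collapse = "|"), ")")
pattern  # "^(on|thr)"

# matches() selects every column whose name matches the regular expression
selected <- select(df, matches(pattern))
names(selected)  # "one" "three"
```

Note that matches ignores case by default, so a column named 'One' would also be selected unless ignore.case = FALSE is passed.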

Upvotes: 6
