viktor_r

Reputation: 721

Transform columns to factor in data frame with dplyr and regular expression

I have a data.frame with more than 100 columns, all of which are formatted as numeric after import. I'd like to convert specific columns from numeric to factor. Instead of converting each column manually, I'd like to select the relevant columns by matching a regular expression against the column names and then transform them. With the help of regexr.com I created the following expression: \b\w{2,4}[1-9]\b. It is supposed to select all columns whose name is a word of 2 to 4 letters followed by a digit from 1 to 9.

Here's an example:

df<-data.frame(pre1=c(1:10), 
               em2=c(1:10), 
               foo=c(1:10))
df
   pre1 em2 foo
1     1   1   1
2     2   2   2
3     3   3   3
4     4   4   4
5     5   5   5
6     6   6   6
7     7   7   7
8     8   8   8
9     9   9   9
10   10  10  10

df %>%
select(matches("/\b\w{2,4]}[1-9]\b/"))
Error: '\w' is an unrecognized escape in character string starting ""/\b\w"

This should select the first two columns, but not the third. It seems that \w is not recognized by matches. Is there any other way to do it?

Upvotes: 1

Views: 5748

Answers (1)

Julia Silge

Reputation: 11603

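The error occurs because a backslash inside an R string literal has to be escaped itself, so \w must be written as \\w (and \b as \\b). The /.../ delimiters from regexr.com are not part of R's regex syntax and should be dropped. With that fixed, you can do this all in one go with dplyr::mutate_at(), using vars() to pick the columns you want to convert to factor.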

library(dplyr)

# tibble() replaces the deprecated data_frame(); dplyr re-exports it
df <- tibble(pre1 = 1:10,
             em2 = 1:10,
             foo = 1:10)

df %>%
  mutate_at(vars(matches("\\b\\w{2,4}[1-9]\\b")), as.factor)

#> # A tibble: 10 x 3
#>      pre1    em2   foo
#>    <fctr> <fctr> <int>
#>  1      1      1     1
#>  2      2      2     2
#>  3      3      3     3
#>  4      4      4     4
#>  5      5      5     5
#>  6      6      6     6
#>  7      7      7     7
#>  8      8      8     8
#>  9      9      9     9
#> 10     10     10    10
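
For newer versions of dplyr (1.0.0 and later), mutate_at() is superseded by across(), so the same idea can be sketched as below. Two things here are assumptions rather than part of the original question: the stricter pattern, which only accepts letters before the digit (\w also matches digits and underscores), and the names pattern and out, which are purely illustrative. Checking the pattern against names(df) with grepl() first is a cheap way to confirm which columns will be touched.

```r
library(dplyr)

df <- tibble(pre1 = 1:10, em2 = 1:10, foo = 1:10)

# Assumed intent: 2 to 4 letters followed by a single digit 1-9,
# anchored to the whole column name.
pattern <- "^[A-Za-z]{2,4}[1-9]$"

# Check which column names match before changing anything.
matched <- grepl(pattern, names(df))
names(df)[matched]

# Convert only the matching columns to factor; foo is left as integer.
out <- df %>%
  mutate(across(matches(pattern), as.factor))
```

Note that matches() ignores case by default, so the pattern would also pick up names such as PRE1 unless ignore.case = FALSE is passed.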

Upvotes: 8