SlavicDoomer

Reputation: 181

R - mutate with regex in a loop

I have a data frame in which every column consists of a number followed by text, e.g. 533 234r/r.

The following code, which strips the text from a single column, works well:

  my_data <- my_data %>%
    mutate(column1 = str_extract(column1, '.+?(?=[a-z])'))

I would like to do it for multiple columns:

col_names <- names(my_data)
for (i in 1:length(col_names)) {
  my_data <- my_data%>%
    mutate(col_names[i] = str_extract(col_names[i], '.+?(?=[a-z])'))
}

But it returns an error:

Error: unexpected '=' in:
"  my_data <- my_data %>%
    mutate(col_names[i] ="

I don't think mutate_all() would work either, because str_extract() requires the column name as an argument.

Upvotes: 1

Views: 176

Answers (1)

akrun

Reputation: 887088

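The left-hand side of = inside mutate() has to be a literal name, which is why col_names[i] = ... fails to parse. When the column names are held as strings, do the assignment with := and unquote the name with !!. On the right-hand side, convert the string to a symbol with rlang::sym() and unquote it with !! so that it is evaluated as a column: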

library(dplyr)
library(stringr)

col_names <- names(my_data)
for (i in seq_along(col_names)) {
  my_data <- my_data %>%
    mutate(!!col_names[i] :=
             str_extract(!!rlang::sym(col_names[i]), '.+?(?=[a-z])'))
}

In the tidyverse, the for loop can be replaced with across(), which applies the function to every selected column (requires dplyr >= 1.0.0):

my_data <- my_data %>%
  mutate(across(everything(), ~ str_extract(., '.+?(?=[a-z])')))

With an older dplyr version, use mutate_all() instead:

my_data <- my_data %>%
  mutate_all(~ str_extract(., '.+?(?=[a-z])'))
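
To check that the loop and across() give the same result, here is a minimal sketch on a made-up data frame. The column names and values are invented for illustration; only the first value comes from the question.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data (not the asker's real data)
my_data <- data.frame(
  column1 = c("533 234r/r", "12kg"),
  column2 = c("7 001abc", "45x"),
  stringsAsFactors = FALSE
)

# Lazily match everything up to (not including) the first lowercase letter
pattern <- ".+?(?=[a-z])"

# Loop version: unquote the string name on the left, a symbol on the right
looped <- my_data
for (nm in names(looped)) {
  looped <- looped %>%
    mutate(!!nm := str_extract(!!rlang::sym(nm), pattern))
}

# across() version: the same function applied to every column at once
vectorised <- my_data %>%
  mutate(across(everything(), ~ str_extract(.x, pattern)))

# column1 becomes "533 234" and "12"; column2 becomes "7 001" and "45"
print(vectorised)
```

Note that str_extract() returns NA for any value with no lowercase letter, since the lookahead then never matches. If some cells might hold only a number, the pattern would need adjusting.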

Upvotes: 1
