R rename all columns with regex

Question

I have messy column names in the following format: the column name in English, followed by a slash (/), followed by the same name in French with the year. For example:

CSD Code / Code de la SDR 2011, Education / Scolarité 2011, Labour Force Activity / Activité sur le marché du travail 2011

Is there a tidyverse-friendly solution that will let me rename all the columns by removing everything after the slash (/) but keeping the year? For example: CSD Code 2011, Education 2011, Labour Force Activity 2011

MrFlick · Accepted Answer

You can use a regular expression. With the sample data:

x <- c("CSD Code / Code de la SDR 2011", 
        "Education / Scolarité 2011", 
        "Labour Force Activity / Activité sur le marché du travail 2011")

You can use the tidyverse package stringr and get

stringr::str_replace(x, " / \\D*(?= \\d+$)", "")
# [1] "CSD Code 2011"             
# [2] "Education 2011"            
# [3] "Labour Force Activity 2011"

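The expression matches a space, a slash, and a space, then any run of non-digit characters (\D*), stopping where the lookahead (?= \d+$) finds a space followed by the digits at the end of the string. A lookahead does not consume what it matches, so the space and the year are left in place while everything between the English name and the year is removed. Inside an R string literal each backslash must be doubled, which is why the pattern is written with \\D and \\d.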
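To see which part of each name the pattern actually removes, here is a small base R sketch. It is not part of the original answer: it assumes that base R's regexpr and sub with perl = TRUE (needed for the lookahead) behave like the stringr call for this pattern, and the "Population 2011" name is a made-up example of a column with no slash.

```r
# Two of the sample names; \u00e9 is "é", written as an escape so the
# script does not depend on the file's encoding.
x <- c("CSD Code / Code de la SDR 2011",
       "Education / Scolarit\u00e9 2011")

pattern <- " / \\D*(?= \\d+$)"

# Extract the substring the pattern matches in each name.
# perl = TRUE is required because the lookahead (?= ...) is PCRE syntax.
matched <- regmatches(x, regexpr(pattern, x, perl = TRUE))
print(matched)

# Remove the matched part; the space and the year survive because the
# lookahead does not consume them.
cleaned <- sub(pattern, "", x, perl = TRUE)
print(cleaned)
# [1] "CSD Code 2011"  "Education 2011"

# A name without " / " does not match and is returned unchanged
# (hypothetical column name, for illustration only).
untouched <- sub(pattern, "", "Population 2011", perl = TRUE)
print(untouched)
# [1] "Population 2011"
```

The matched pieces start at the space before the slash and end just before the space that precedes the year, which is why no doubled or missing space appears in the result.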

You can apply the same replacement to column names with dplyr::rename_with:

my_data %>% 
  rename_with(~stringr::str_replace(., " / \\D*(?= \\d+$)", ""))
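
For a version that runs end to end, here is a minimal sketch with a stand-in data frame. The contents of my_data below are invented for illustration (the question does not show the actual data), and the example assumes dplyr 1.0.0 or later, where rename_with is available, with stringr installed.

```r
library(dplyr)

# Hypothetical stand-in for my_data; the values are made up.
my_data <- data.frame(a = c(1001, 1002), b = c("A", "B"))

# Assign the messy names afterwards so they are kept exactly as written.
# \u00e9 is "é", written as an escape to avoid encoding issues.
names(my_data) <- c("CSD Code / Code de la SDR 2011",
                    "Education / Scolarit\u00e9 2011")

# rename_with applies the function to every column name by default
# (.cols = everything()).
cleaned_data <- my_data %>%
  rename_with(~ stringr::str_replace(.x, " / \\D*(?= \\d+$)", ""))

print(names(cleaned_data))
# [1] "CSD Code 2011"  "Education 2011"
```

Only the names change; the column values and their order are left as they were.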
