deschen

Reputation: 10996

tidyverse rename_with giving error when trying to provide new names based on existing column values

Assuming the following data set:

df <- data.frame(...1 = c(1, 2, 3),
                 ...2 = c(1, 2, 3),
                 n_column = c(1, 1, 2))

I now want to rename all vars that start with "...". My real data sets could have different numbers of "..." vars. The information about how many such vars I have is in the n_column column, more precisely, it is the maximum of that column.

So I tried:

df %>%
  rename_with(.cols = starts_with("..."),
              .fn   = paste0("new_name", 1:max(n_column)))

which gives an error:

# Error in paste0("new_name", 1:max(n_column)) : 
#   object 'n_column' not found

So I guess the problem is that the paste0 function does not look for the column I provide within the current data set. I'm not sure, though, how I could make it do so. Any ideas?

I know I could bypass the whole thing by just creating an external scalar that contains the max. of n_column, but ideally I'd like to do everything in one pipeline.

Upvotes: 3

Views: 242

Answers (3)

Roman

Reputation: 17648

A completely different approach would be

df %>% janitor::clean_names()
  x1 x2 n_column
1  1  1        1
2  2  2        1
3  3  3        2

Upvotes: 1

akrun

Reputation: 887223

We can use str_c

library(dplyr)
library(stringr)
df %>% 
    rename_with(~str_c("new_name", seq_along(.)),  starts_with("..."))

Or using base R

i1 <- startsWith(names(df), "...")
names(df)[i1] <- sub("...", "new_name", names(df)[i1], fixed = TRUE)
df
  new_name1 new_name2 n_column
1         1         1        1
2         2         2        1
3         3         3        2

Upvotes: 1

Ronak Shah

Reputation: 389047

You don't need the information from n_column: .cols passes only the names of the columns that satisfy the condition (starts_with("...")) to the function.

library(dplyr)

df %>% rename_with(~paste0("new_name", seq_along(.)),  starts_with("..."))

#  new_name1 new_name2 n_column
#1         1         1        1
#2         2         2        1
#3         3         3        2

This is also safer than using max(n_column): if the data in n_column gets corrupted, or the number of columns starting with ... changes, this will still work.
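As a small illustration of that point, here is a sketch using a hypothetical data frame df3 (not from the question) that has three ... columns while n_column still only goes up to 2. Because seq_along() counts the names that were actually selected, the renaming does not depend on n_column at all:

```r
library(dplyr)

# Hypothetical data: three "..." columns, but max(n_column) is only 2
df3 <- data.frame(...1 = 1:3,
                  ...2 = 4:6,
                  ...3 = 7:9,
                  n_column = c(1, 1, 2))

# .x holds the names of the selected columns, so seq_along(.x) is 1, 2, 3
out <- df3 %>%
  rename_with(~ paste0("new_name", seq_along(.x)), starts_with("..."))

names(out)
```

A version built on 1:max(n_column) would produce only two new names for three selected columns here, so it could not rename them correctly.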


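A way to refer to column values in rename_with is to use a regular anonymous function instead of a ~ formula. Inside a ~ formula, . refers to the selected column names, whereas in a regular function . still refers to the data frame piped in by %>%, so you can use .$n_column.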

df %>% 
     rename_with(function(x) paste0("new_name", 1:max(.$n_column)),
                 starts_with("..."))

I am assuming this is part of a longer chain, so you don't want to use max(df$n_column).
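If relying on magrittr's . is not an option (for example with the native |> pipe, where . is not defined), one possible alternative is to wrap the step in a function that names the data explicitly. This is only a sketch: the argument name d is made up for illustration, and the \(d) shorthand and |> both assume R 4.1 or later.

```r
library(dplyr)

df <- data.frame(...1 = c(1, 2, 3),
                 ...2 = c(1, 2, 3),
                 n_column = c(1, 1, 2))

# d is the piped-in data frame; the ~ formula can see it because the
# formula is created inside the wrapper function
res <- df |>
  (\(d) rename_with(d,
                    ~ paste0("new_name", seq_len(max(d$n_column))),
                    starts_with("...")))()

names(res)
```

This keeps everything in one pipeline, but it still depends on max(n_column) matching the number of ... columns.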

Upvotes: 3
