Reputation: 10996
Assuming the following data set:
df <- data.frame(...1 = c(1, 2, 3),
...2 = c(1, 2, 3),
n_column = c(1, 1, 2))
I now want to rename all vars that start with "...". My real data sets could have different numbers of "..." vars. The information about how many such vars I have is in the n_column
column, more precisely, it is the maximum of that column.
So I tried:
df %>%
rename_with(.cols = starts_with("..."),
.fn = paste0("new_name", 1:max(n_column)))
which gives an error:
# Error in paste0("new_name", 1:max(n_column)) :
# object 'n_column' not found
So I guess the problem is that the paste0 function does look for the column I provide within the current data set. Not sure, though, how I could do so. Any ideas?
I know I could bypass the whole thing by just creating an external scalar that contains the max. of n_column, but ideally I'd like to do everything in one pipeline.
Upvotes: 3
Views: 242
Reputation: 17648
A completly other approach would be
df %>% janitor::clean_names()
x1 x2 n_column
1 1 1 1
2 2 2 1
3 3 3 2
Upvotes: 1
Reputation: 887223
We can use str_c
library(dplyr)
library(stringr)
df %>%
rename_with(~str_c("new_name", seq_along(.)), starts_with("..."))
Or using base R
i1 <- startsWith(names(df), "...")
names(df)[i1] <- sub("...", "new_name", names(df)[i1], fixed = TRUE)
df
new_name1 new_name2 n_column
1 1 1 1
2 2 2 1
3 3 3 2
Upvotes: 1
Reputation: 389047
You don't need information from n_column
, .cols
will pass only those columns that satisfy the condition (starts_with("...")
).
library(dplyr)
df %>% rename_with(~paste0("new_name", seq_along(.)), starts_with("..."))
# new_name1 new_name2 n_column
#1 1 1 1
#2 2 2 1
#3 3 3 2
This is safer than using max(n_column)
as well, for example if the data from n_column
gets corrupted or the number of columns with ...
change this will still work.
A way to refer to column values in rename_with
would be to use anonymous function so that you can use .$n_column
.
df %>%
rename_with(function(x) paste0("new_name", 1:max(.$n_column)),
starts_with("..."))
I am assuming this is part of longer chain so you don't want to use max(df$n_column)
.
Upvotes: 3