Reputation: 1041
I want to create a function that renames specific values in a column to something else, with the mapping specified by the function. Something like this (although in reality there would be many more values to rename):
func <- function(x) x %>%
  mutate(col_name = ifelse(col_name == "something", "something else",
                    # keep the original value when neither condition matches
                    ifelse(col_name == "something2", "something_else2", col_name)))
Note that it isn't the column names that I want to change; it is the values themselves in the columns. However, I would like this to work regardless of which column the values are in (i.e. the function should work across the whole data frame). Also, this only works if the values named in the function are present, and I would like it to ignore the ones that aren't present in the columns. Here is a small reproducible example (the column values are arbitrary):
col1 <- c("a","b","c","d","e")
col2 <- c("b","f","d","c","g")
df <- data.frame(col1, col2)
col3 <- c("a","h","i","b","c")
col4 <- c("c","d","j","a","g")
df2 <- data.frame(col3, col4)
Which looks like this:
df:
  col1 col2
1    a    b
2    b    f
3    c    d
4    d    c
5    e    g
df2:
  col3 col4
1    a    c
2    h    d
3    i    j
4    b    a
5    c    g
Say that I want to rename the values like this:
df:
  col1 col2
1  can  chi
2  chi  pig
3  equ  she
4  she  equ
5  fox  bov
df2:
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov
So what I am hoping for is a function that renames multiple values in data frame columns regardless of their position in the data frame, and that ignores any values named in the function that are not found in the data frame.
Upvotes: 1
Views: 1358
Reputation: 18681
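library(dplyr)
func = function(x, originals = letters[1:10],
                rename_tos = c("can", "chi", "equ", "she", "fox", "pig", "bov", "avi", "tyr", "asp")){
  # Build the lookup table: names are the original values, elements are the replacements
  names(rename_tos) = originals
  x %>%
    # Factors would index by integer code, so convert them to character first
    mutate_if(is.factor, as.character) %>%
    # Look up every value of every column in the named vector
    lapply(function(y) rename_tos[y]) %>%
    data.frame(row.names = NULL)
}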
Results:
> func(df)
  col1 col2
1  can  chi
2  chi  pig
3  equ  she
4  she  equ
5  fox  bov
> func(df2)
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov
Notes:
The method I used is basically to create a lookup table (a named vector) for the renames and index the rename_tos vector with the column values. Values that have no entry in the lookup table come back as NA. Here, I've set the originals and renames as the defaults of the function, but you can also supply your own.
If you want to rename values only in specified columns and leave the other columns unchanged, you can do something like the following:
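To make the lookup idea concrete, here is a minimal base R sketch on a plain character vector. The names lookup, values, renamed and renamed_keep are made up for this illustration, and the fallback step at the end is an assumed variation that is not part of func above.

```r
# Hypothetical lookup: names are the original values, elements are the replacements.
lookup <- c(a = "can", b = "chi", c = "equ")

values <- c("a", "c", "z", "b")

# Indexing by name returns the replacement for each value;
# "z" has no entry in the lookup, so it comes back as NA.
renamed <- unname(lookup[values])

# Assumed variation: fall back to the original value when there is no match.
renamed_keep <- ifelse(is.na(renamed), values, renamed)

print(renamed)
print(renamed_keep)
```

Note that this relies on values being a character vector. Indexing with a factor would use its integer codes instead of its labels, which is why func converts factors to character first.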
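library(dplyr)
library(rlang)
func = function(x, ..., originals = letters[1:10],
                rename_tos = c("can", "chi", "equ", "she", "fox", "pig", "bov", "avi", "tyr", "asp")){
  # Build the lookup table: names are the original values, elements are the replacements
  names(rename_tos) = originals
  # Capture the column names passed through ...
  dots = quos(...)
  x %>%
    # Convert only the selected columns to character
    mutate_at(vars(!!! dots), as.character) %>%
    # Look up the values of the selected columns in the named vector
    mutate_at(vars(!!! dots), ~ rename_tos[.]) %>%
    data.frame(row.names = NULL)
}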
Result:
> func(df, col2)
  col1 col2
1    a  chi
2    b  pig
3    c  she
4    d  equ
5    e  bov
> func(df2, col3, col4)
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov
> func(df2, c(col3, col4))
  col3 col4
1  can  equ
2  avi  she
3  tyr  asp
4  chi  can
5  equ  bov
Notes:
Here, I added the ... argument to allow the user to input their own column names. I used quos from rlang to quote the ... arguments and later unquoted them inside vars in mutate_at using !!!. For example, if the user supplies func(df, col2), the first argument of mutate_at evaluates to vars(col2). This works with multiple arguments as well as with a vector of arguments, as one can see in the results.
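For comparison, the same column selection can be sketched with across(), assuming dplyr 1.0.0 or later. This is an alternative formulation, not the approach used above: rename_values and its lookup argument are hypothetical names introduced for this example, and the ... arguments are forwarded to across() through c(...) instead of being quoted with quos.

```r
library(dplyr)

# Hypothetical helper (not from the answer above): rename values only in
# the columns passed through ..., using a named lookup vector.
rename_values <- function(x, ..., lookup) {
  x %>%
    # as.character() guards against factor columns being indexed by code
    mutate(across(c(...), ~ unname(lookup[as.character(.x)])))
}

# Same mapping as the defaults of func: a -> can, b -> chi, ..., j -> asp
lookup <- setNames(
  c("can", "chi", "equ", "she", "fox", "pig", "bov", "avi", "tyr", "asp"),
  letters[1:10]
)

df <- data.frame(col1 = c("a", "b", "c", "d", "e"),
                 col2 = c("b", "f", "d", "c", "g"))

# Only col2 is renamed; col1 is left as it is.
out <- rename_values(df, col2, lookup = lookup)
print(out)
```

Because lookup comes after ... in the signature, it has to be passed by name, as in the call above.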
Upvotes: 1