Reputation: 1
I am working on merging multiple datasets from different sources. The column names from each dataset (as datframes) have different names and are in different orders. I have created a dictionary that contains all the different names and the common name I want to rename the original names with. How do I rename the original column names using the dictionary in R? I specifically want to use a dictionary because I may add more datasets (with different column names) in the future and it would be easy to adapt the dictionary.
I know I can manually rename every column but there are many (like 30) and they may change with the addition of new datasets.
df1 <- data.frame(site = c(1:6), code = c(rep("A",3), rep("B", 3)), result = c(20:25))
df2 <- data.frame(site_no = c(10:19), day = c(1:10), test = c(rep("A", 5), rep("B", 5)), value = c(1:10))
dict <- data.frame(oldName = c("site", "code", "result", "site_no", "day", "test", "value"), newName = c("site_number", "parameter", "result", "site_number", "day", "parameter", "result"))
I would like to rename the columns in df1 and df2 based on the dict dataframe, which contains the old names (all the column names from df1 and df2) and the new names (the common names to use).
The result would be:
colnames(df1)
"site_number" "parameter" "result"
colnames(df2)
"site_number" "day" "parameter" "result"
Upvotes: 0
Views: 999
Reputation: 887068
We can use rename_all
after placing the datasets in a list
. It is better to have those datasets in a list
instead of having them in the global environment
library(dplyr)
library(purrr)
out <- mget(ls(pattern = "^df\\d+$")) %>%
map(~ .x %>%
rename_all(~ as.character(dict$newName)[match(., dict$oldName)]))
If we want, we can can change the column names in the original object with list2env
list2env(out, .GlobalEnv)
names(df1)
#[1] "site_number" "parameter" "result"
names(df2)
#[1] "site_number" "day" "parameter" "result"
Upvotes: 1
Reputation: 79208
We can match
the names of the respective df
to the oldname, then extract the newname at the matched indices:
names(df1) = with(dict,newName[match(names(df1),oldName)])
names(df2) = with(dict,newName[match(names(df2),oldName)])
print(df1)
print(df2)
Upvotes: 3