Reputation: 75
Objective: add the dataframe names as prefixes to some columns for a long list of dataframes
Problem: With lapply or a loop, R doesn't seem to pass the names of the dataframes to the function.
Data:
A<-data.frame(column_1=c(1,2,3),column_2=c(4,5,6),column_3=c(7,8,9),column_4=c(10,11,12))
B<-data.frame(column_1=c(13,14,15),column_2=c(16,17,18),column_3=c(19,20,21),column_4=c(22,23,24))
C<-data.frame(column_1=c(25,26,27),column_2=c(28,29,30),column_3=c(31,32,33),column_4=c(34,35,36))
list_of_dataframes<-list(A,B,C)
names(list_of_dataframes)<-c("A","B","C")
This is just an example. My actual list of dataframes is quite long, so adding the names manually with 'comment', as I do below, is impractical.
Desired Solution:
$A
A_column_1 A_column_2 A_column_3 column_4
1 1 4 7 10
2 2 5 8 11
3 3 6 9 12
$B
B_column_1 B_column_2 B_column_3 column_4
1 13 16 19 22
2 14 17 20 23
3 15 18 21 24
$C
C_column_1 C_column_2 C_column_3 column_4
1 25 28 31 34
2 26 29 32 35
3 27 30 33 36
As you can see, the dataframe names are prefixed to the column names, except for column_4, which I want to exclude from this operation.
My Try:
I produced the desired output above with this code:
comment(list_of_dataframes$A) <- "A"
comment(list_of_dataframes$B) <- "B"
comment(list_of_dataframes$C) <- "C"
list_of_dataframes <- lapply(list_of_dataframes, function(dataframe) {
  a <- comment(dataframe)
  colnames(dataframe)[c(1,2,3)] <- paste(a, colnames(dataframe)[c(1,2,3)], sep = "_")
  return(dataframe)
})
list_of_dataframes
The problem with this solution is that I have a very long list of dataframes, and many such lists, so I need to do all of this automatically. In the code above I use 'comment' and type in the name of each dataframe separately. Instead, I need to pick up the name of each dataframe automatically. How can I do this?
I tried to use deparse(substitute(dataframe)) as here:
list_of_dataframes <- lapply(list_of_dataframes, function(dataframe) {
  a <- deparse(substitute(dataframe))
  colnames(dataframe) <- paste(a, colnames(dataframe), sep = "_")
  return(dataframe)
})
But as you can see, the names of the dataframes don't seem to be passed through lapply:
$A
X[[i]]_column_1 X[[i]]_column_2 X[[i]]_column_3 X[[i]]_column_4
1 1 4 7 10
2 2 5 8 11
3 3 6 9 12
$B
X[[i]]_column_1 X[[i]]_column_2 X[[i]]_column_3 X[[i]]_column_4
1 13 16 19 22
2 14 17 20 23
3 15 18 21 24
$C
X[[i]]_column_1 X[[i]]_column_2 X[[i]]_column_3 X[[i]]_column_4
1 25 28 31 34
2 26 29 32 35
3 27 30 33 36
Do you have any ideas how I can overcome this problem?
Upvotes: 3
Views: 120
Reputation: 887971
We can also use imap from purrr together with str_c from stringr:
library(dplyr)
library(purrr)
library(stringr)  # provides str_c
imap(list_of_dataframes, ~ {
  nm1 <- .y
  .x %>% rename_at(1:3, ~ str_c(nm1, ., sep = "_"))
})
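rename_at is marked as superseded in current dplyr releases. A minimal sketch of the same idea with rename_with follows, assuming dplyr 1.0.0 or later is installed. The two-element list is only a stand-in for the list in the question, and `prefixed`, `df`, `nm` and `col` are illustrative names.

```r
library(dplyr)
library(purrr)

# Small stand-in for the list from the question (assumption: two data frames only).
list_of_dataframes <- list(
  A = data.frame(column_1 = 1:3, column_2 = 4:6, column_3 = 7:9, column_4 = 10:12),
  B = data.frame(column_1 = 13:15, column_2 = 16:18, column_3 = 19:21, column_4 = 22:24)
)

# imap() calls the function with each element and its name.
# Plain functions are used instead of nested ~ lambdas so the arguments do not shadow each other.
prefixed <- imap(list_of_dataframes, function(df, nm) {
  rename_with(df, function(col) paste(nm, col, sep = "_"), .cols = 1:3)
})

names(prefixed$B)
```

Only the first three columns are renamed; column_4 keeps its original name in every data frame.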
Upvotes: 0
Reputation: 389325
In base R, you can use Map to pass each data frame together with its name:
Map(function(x, y) {names(x)[1:3] = paste(y, names(x)[1:3], sep = "_");x},
list_of_dataframes, names(list_of_dataframes))
Or using imap from purrr:
library(dplyr)
purrr::imap(list_of_dataframes,
~.x %>% rename_at(1:3, function(x) paste(.y, x, sep = "_")))
#$A
# A_column_1 A_column_2 A_column_3 column_4
#1 1 4 7 10
#2 2 5 8 11
#3 3 6 9 12
#$B
# B_column_1 B_column_2 B_column_3 column_4
#1 13 16 19 22
#2 14 17 20 23
#3 15 18 21 24
#$C
# C_column_1 C_column_2 C_column_3 column_4
#1 25 28 31 34
#2 26 29 32 35
#3 27 30 33 36
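As for why the deparse(substitute(dataframe)) attempt printed X[[i]]: lapply calls the function as FUN(X[[i]], ...), so the expression that substitute sees is X[[i]], not the list name. One way around this in base R is to iterate over the names and look each data frame up by name. The sketch below assumes a two-element stand-in for the list in the question; `add_prefix`, `nms` and `result` are hypothetical names introduced for illustration.

```r
# Small stand-in for the list from the question (assumption: two data frames only).
list_of_dataframes <- list(
  A = data.frame(column_1 = 1:3, column_2 = 4:6, column_3 = 7:9, column_4 = 10:12),
  B = data.frame(column_1 = 13:15, column_2 = 16:18, column_3 = 19:21, column_4 = 22:24)
)

# Hypothetical helper: prefix the columns at positions `cols` with `prefix`.
add_prefix <- function(df, prefix, cols = 1:3) {
  names(df)[cols] <- paste(prefix, names(df)[cols], sep = "_")
  df
}

# Iterate over the names, fetch each data frame by name,
# then restore the list names that lapply() over a character vector drops.
nms <- names(list_of_dataframes)
result <- setNames(
  lapply(nms, function(nm) add_prefix(list_of_dataframes[[nm]], nm)),
  nms
)

names(result$A)
```

The same helper can be reused across several lists, since nothing in it depends on a particular list or on how many data frames it holds.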
Upvotes: 2