Reputation: 1768
I have 20 data frames, and in each of them I want to format the same column in the same way. Of course, I could make a list of the dfs and then use lapply. Instead, my goal is to modify the dfs so that in the end I can access them as individual dfs rather than as elements of a list. Here is an example:
df1 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
df2 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
Now, suppose I want to add 1 to every value of col1 in df1 and df2. Of course, I could do
df_list <- lapply(list(df1, df2), function(df) {
  df$col1 <- df$col1 + 1
  return(df)
})
But now df1 still returns the original df instead of the modified one, because the modified copies only exist inside df_list. How can I modify df1 and df2 themselves?
Upvotes: 2
Views: 606
Reputation: 5456
You could avoid the function (and its temporary environment) with a loop like this:
df1 <- data.frame(col1 = 1:5, col2 = rnorm(5))
df2 <- data.frame(col1 = rep(0, 5), col2 = rnorm(5))
df1 # before
for (d in c("df1", "df2")) {
  # builds and evaluates e.g.: df1 [['col1']] <- df1 [['col1']] + 1
  eval(parse(text = paste(d, "[['col1']] <- ", d, "[['col1']] + 1")))
}
df1 # after
Option 2:
df1 <- data.frame(col1 = 1:5, col2 = rnorm(5))
df2 <- data.frame(col1 = rep(0, 5), col2 = rnorm(5))
df1 # before
df2 # before
eval(parse(text = unlist(lapply(c("df1", "df2"), function(x) {
  expr.dummy <- quote(df$col1 <- df$col1 + 1) # df will be replaced by df1, df2
  gsub("df", x, deparse(expr.dummy))
}))))
df1 # after
df2 # after
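If the string pasting feels fragile (for example, gsub("df", ...) would also rewrite any other occurrence of "df" in the expression), the same idea can be sketched with bquote() and as.name(), which build the call directly from the object name. This is only a minimal sketch, assuming the data frames live in the environment where the loop runs. The sample values and the names nm, sym and expr are made up for illustration.

```r
# Illustrative data (values are made up)
df1 <- data.frame(col1 = 1:5, col2 = 6:10)
df2 <- data.frame(col1 = rep(0, 5), col2 = 1:5)

for (nm in c("df1", "df2")) {
  sym <- as.name(nm)                               # turn "df1" into the symbol df1
  expr <- bquote(.(sym)$col1 <- .(sym)$col1 + 1)   # e.g. df1$col1 <- df1$col1 + 1
  eval(expr)                                       # evaluated in the calling environment
}
```

No text is parsed here, so column names or other parts of the expression that happen to contain "df" are left alone.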
Upvotes: 1
Reputation: 47320
You could use a hack from @g-grothendieck in this question and do this:
list[df1, df2] <- lapply(list(df1, df2), function(df) {
  df$col1 <- df$col1 + 1
  return(df)
})
The hack:
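# A dummy object named `list`; calls such as list(df1, df2) still find base::list
list <- structure(NA, class = "result")
"[<-.result" <- function(x, ..., value) {
  args <- as.list(match.call())
  # drop the function name, `x` and `value`; keep the target names
  args <- args[-c(1:2, length(args))]
  length(value) <- length(args)
  for (i in seq_along(args)) {
    a <- args[[i]]
    # assign the i-th element of `value` to the i-th target in the caller
    if (!missing(a)) eval.parent(substitute(a <- v, list(a = a, v = value[[i]])))
  }
  x
}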
Full code and results:
df1 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
# col1 col2
# 1 -0.5451934 0.5043287
# 2 -1.4047701 -0.1184588
# 3 0.1745109 0.8279085
# 4 -0.5066673 -0.3269411
# 5 0.4838625 -0.3895784
df2 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
# col1 col2
# 1 0.4168078 -0.44654445
# 2 -1.9991098 -0.06179699
# 3 -1.0625996 1.21098946
# 4 0.4977718 0.45834008
# 5 -1.6181048 0.97917877
list[df1, df2] <- lapply(list(df1, df2), function(df) {
  df$col1 <- df$col1 + 1
  return(df)
})
# > df1
# col1 col2
# 1 0.4548066 0.5043287
# 2 -0.4047701 -0.1184588
# 3 1.1745109 0.8279085
# 4 0.4933327 -0.3269411
# 5 1.4838625 -0.3895784
# > df2
# col1 col2
# 1 1.41680778 -0.44654445
# 2 -0.99910976 -0.06179699
# 3 -0.06259959 1.21098946
# 4 1.49777179 0.45834008
# 5 -0.61810483 0.97917877
Upvotes: 1
Reputation: 887118
One option based on the OP's code would be to use list2env after naming the list elements:
names(df_list) <- paste0("df", 1:2)
list2env(df_list, envir = .GlobalEnv)
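For a self-contained picture of the list route, here is a minimal sketch. It assumes the data frames already exist under the names stored in df_names (a variable name made up for this example), and it uses mget() to collect them into a list that is already named, so the separate names<- step is not needed. The sample values are illustrative only, and envir = environment() writes back to wherever the code runs (the global environment at top level).

```r
# Illustrative data (values are made up)
df1 <- data.frame(col1 = c(1, 2, 3), col2 = c(10, 20, 30))
df2 <- data.frame(col1 = c(0, 0, 0), col2 = c(4, 5, 6))

df_names <- c("df1", "df2")        # hypothetical helper variable
df_list <- mget(df_names)          # named list: list(df1 = ..., df2 = ...)

df_list <- lapply(df_list, function(df) {
  df$col1 <- df$col1 + 1
  df
})

# Copy each list element back to an object of the same name
invisible(list2env(df_list, envir = environment()))
```

With 20 data frames, df_names could be built with something like paste0("df", 1:20).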
If we need to avoid creating the list (although keeping the datasets in a list is generally recommended over creating individual objects in the global environment), then use assign with a for loop:
for (obj in paste0('df', 1:2)) {
  # replace the 'col1' column of the object named in `obj` and store it back
  assign(obj, `[<-`(get(obj), 'col1', value = get(obj)[['col1']] + 1))
}
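The one-liner inside that loop is fairly dense, so here is the same get/modify/assign idea spelled out in separate steps. It is a minimal sketch with made-up sample values. The temporary name tmp is introduced only for this example, and the objects are assumed to live in the environment where the loop runs.

```r
# Illustrative data (values are made up)
df1 <- data.frame(col1 = c(1, 2, 3), col2 = c(10, 20, 30))
df2 <- data.frame(col1 = c(0, 0, 0), col2 = c(4, 5, 6))

for (obj in paste0("df", 1:2)) {
  tmp <- get(obj)              # fetch the data frame by name
  tmp$col1 <- tmp$col1 + 1     # modify the copy
  assign(obj, tmp)             # store it back under the same name
}
rm(tmp)                        # remove the temporary copy
```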
Upvotes: 3