Joe

Reputation: 1768

How to modify multiple data frames without making a list of them and then using lapply?

I have 20 data frames, and I want to format the same column in each of them in the same way. I could put the data frames in a list and use lapply, but I want to end up with the modified data frames available under their original names, not as elements of a list. Here is an example:

df1 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
df2 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))

Now suppose I want to add 1 to every value of col1 in both df1 and df2. I could do:

df_list <- lapply(list(df1, df2), function(df) {
  df$col1 <- df$col1 + 1
  return(df)
})

But df1 still holds the original data frame instead of the modified one, because the modified copies only exist inside df_list. How can I modify df1 and df2 themselves?

Upvotes: 2

Views: 606

Answers (3)

r.user.05apr

Reputation: 5456

You could avoid the function (and its temporary environment) with a loop that builds each assignment as text and evaluates it:

df1 <- data.frame(col1 = 1:5, col2 = rnorm(5))
df2 <- data.frame(col1 = rep(0, 5), col2 = rnorm(5))

df1 # before
for (d in c("df1", "df2")) {
  eval(parse(text = paste(d, "[['col1']] <- ", d, "[['col1']] + 1")))
}
df1 # after

Option 2, which builds the expressions from a quoted template:

df1 <- data.frame(col1 = 1:5, col2 = rnorm(5))
df2 <- data.frame(col1 = rep(0, 5), col2 = rnorm(5))

df1 # before
df2 # before
eval(parse(text = unlist(lapply(c("df1", "df2"), function(x) {
  expr.dummy <- quote(df$col1 <- df$col1 + 1) # df will be replaced by df1, df2
  gsub("df", x, deparse(expr.dummy))
}))))
df1 # after
df2 # after
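A variant of the same idea that builds the call from symbols instead of pasted text, so no string substitution is involved. This is only a sketch: it assumes the data frames live in the environment where the loop runs, and the loop variable names nm and sym are just illustrative.

```r
df1 <- data.frame(col1 = 1:5, col2 = 6:10)
df2 <- data.frame(col1 = rep(0, 5), col2 = 1:5)

for (nm in c("df1", "df2")) {
  sym <- as.name(nm)  # turn the string "df1" into the symbol df1
  # bquote() splices the symbol into the expression wherever .(sym) appears,
  # giving e.g. df1$col1 <- df1$col1 + 1, which eval() then runs here
  eval(bquote(.(sym)$col1 <- .(sym)$col1 + 1))
}

df1$col1
df2$col1
```

Because the name is inserted as a symbol, a column or value that happens to contain the text "df" is left alone, which is not the case with gsub.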

Upvotes: 1

moodymudskipper

Reputation: 47320

You could use a hack from @g-grothendieck, taken from this question:

http://stackoverflow.com/questions/1826519/how-to-assign-from-a-function-which-returns-more-than-one-value

and do this:

list[df1, df2] <- lapply(list(df1, df2), function(df) {
  df$col1 <- df$col1 + 1
  return(df)
})

The hack, which has to be defined before running the line above:

list <- structure(NA, class = "result")
"[<-.result" <- function(x, ..., value) {
  args <- as.list(match.call())
  args <- args[-c(1:2, length(args))]
  length(value) <- length(args)
  for (i in seq_along(args)) {
    a <- args[[i]]
    if (!missing(a)) eval.parent(substitute(a <- v, list(a = a, v = value[[i]])))
  }
  x
}

Full code and results:

df1 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
#         col1       col2
# 1 -0.5451934  0.5043287
# 2 -1.4047701 -0.1184588
# 3  0.1745109  0.8279085
# 4 -0.5066673 -0.3269411
# 5  0.4838625 -0.3895784
df2 <- data.frame(col1 = rnorm(5), col2 = rnorm(5))
#         col1        col2
# 1  0.4168078 -0.44654445
# 2 -1.9991098 -0.06179699
# 3 -1.0625996  1.21098946
# 4  0.4977718  0.45834008
# 5 -1.6181048  0.97917877

list[df1, df2] <- lapply(list(df1, df2), function(df) {
  df$col1 <- df$col1 + 1
  return(df)
})
# > df1
#         col1       col2
# 1  0.4548066  0.5043287
# 2 -0.4047701 -0.1184588
# 3  1.1745109  0.8279085
# 4  0.4933327 -0.3269411
# 5  1.4838625 -0.3895784
# > df2
#          col1        col2
# 1  1.41680778 -0.44654445
# 2 -0.99910976 -0.06179699
# 3 -0.06259959  1.21098946
# 4  1.49777179  0.45834008
# 5 -0.61810483  0.97917877

Upvotes: 1

akrun

Reputation: 887118

One option based on the OP's code would be to use list2env after naming the list elements:

names(df_list) <- paste0("df", 1:2)
list2env(df_list, envir = .GlobalEnv)
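For reference, here is a self-contained sketch of that round trip. It uses mget to collect the data frames into a named list (so the names do not have to be set by hand) and writes back to the current environment via environment(), which is .GlobalEnv when run at top level; the example data is made up.

```r
df1 <- data.frame(col1 = 1:5, col2 = 6:10)
df2 <- data.frame(col1 = rep(0, 5), col2 = 1:5)

env <- environment()  # the current environment; .GlobalEnv at top level

# mget() returns a list named after the objects: list(df1 = ..., df2 = ...)
df_list <- mget(c("df1", "df2"), envir = env)

# lapply() keeps the names, so each modified copy stays labelled
df_list <- lapply(df_list, function(df) {
  df$col1 <- df$col1 + 1
  df
})

# write every list element back under its name, overwriting df1 and df2
invisible(list2env(df_list, envir = env))

df1$col1
df2$col1
```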

If we need to avoid creating the list (although keeping the datasets in a list is generally recommended over creating individual objects in the global environment), then use assign in a for loop:

for (obj in paste0("df", 1:2)) {
  assign(obj, `[<-`(get(obj), "col1", value = get(obj)[["col1"]] + 1))
}
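With 20 data frames it may be handy to wrap the get/assign pair in a small helper. The function below is a hypothetical sketch (modify_in_place is not an existing function, and its argument names are my own); it applies any function f to each named object and stores the result back under the same name.

```r
# Hypothetical helper: apply f to each named object and assign the result
# back in the caller's environment (or in envir, if given).
modify_in_place <- function(obj_names, f, envir = parent.frame()) {
  force(envir)  # resolve the caller's environment before doing anything else
  for (nm in obj_names) {
    assign(nm, f(get(nm, envir = envir)), envir = envir)
  }
  invisible(obj_names)
}

df1 <- data.frame(col1 = 1:5, col2 = 6:10)
df2 <- data.frame(col1 = rep(0, 5), col2 = 1:5)

modify_in_place(paste0("df", 1:2), function(df) {
  df$col1 <- df$col1 + 1
  df
})

df1$col1
df2$col1
```

The transformation is written once as an ordinary function, so it can be tested on a single data frame before being applied to all of them.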

Upvotes: 3
