Jake

Reputation: 474

for loop iterating over a list of data frames

I am trying to write a for loop that iterates over a list, checks whether each element is a data frame and, if it is, calculates the mean and adds a new column holding that mean value. Here is an example:

df1 <- data.frame(
  Number = c(45,62,27,34,37,55,40))
df2 <- data.frame(
  Number = c(15,20,32,21,17,18,13))
df3 <- data.frame(
  Number = c(12,32,22,14,16,21,30))

L <- list(df1,df2,df3)

for(i in L){if(is.data.frame(i)){
  i$Average <- mean(i)
}
}

and an example of the result I am after for df1 would be:

 Number  Average
1     45 42.85714
2     62 42.85714
3     27 42.85714
4     34 42.85714
5     37 42.85714
6     55 42.85714
7     40 42.85714

Thanks!

Upvotes: 0

Views: 77

Answers (3)

apax

Reputation: 160

i is only a temporary object that controls your for loop. It holds a copy of each list element, so changing it does not change L. To modify the data frames stored in L, index the list by position, like this:

df1 <- data.frame(Number = c(45,62,27,34,37,55,40))
df2 <- data.frame(Number = c(15,20,32,21,17,18,13))
df3 <- data.frame(Number = c(12,32,22,14,16,21,30))

L <- list(df1,df2,df3)

for (i in seq_along(L)) {
  if (is.data.frame(L[[i]])) {
    ## Requires explicitly extracting the values in
    ## L[[i]] by name. So could be problematic if you actually
    ## have many columns in your dataframes.
    L[[i]]$Average <- mean(L[[i]]$Number)
  }
}
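To see the difference between the two loop styles side by side, here is a minimal sketch on a small made-up list (the values and the loop variable names d and k are only for illustration, not from the question):

```r
# A one-element list with a tiny, made-up data frame
L <- list(data.frame(Number = c(1, 2, 3)))

# Modifying the loop variable only changes a temporary copy
for (d in L) {
  d$Average <- mean(d$Number)
}
names(L[[1]])  # still only "Number"

# Assigning through the index changes the list element itself
for (k in seq_along(L)) {
  if (is.data.frame(L[[k]])) {
    L[[k]]$Average <- mean(L[[k]]$Number)
  }
}
names(L[[1]])  # now "Number" and "Average"
```

Note that this updates the copies held in L, not any standalone objects the list was built from.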

Upvotes: 1

akrun

Reputation: 887118

If we need to update the original data.frame objects (df1, df2, df3) with the new column, then use assign:

nm1 <- paste0("df", 1:3)
for(i in seq_along(L)) {
    assign(nm1[i], `[<-`(L[[i]], "Average", value = mean(L[[i]]$Number)))
 }   

df1
#  Number  Average
#1     45 42.85714
#2     62 42.85714
#3     27 42.85714
#4     34 42.85714
#5     37 42.85714
#6     55 42.85714
#7     40 42.85714

Regarding why the OP's loop didn't work,

for(i in L) print(i)

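returns the values of the list elements and not the names of the original objects. So an assignment such as i$Average <- only changes the temporary copy i, not the list or the original data frames. The list elements also don't have names. In addition, mean works on a vector. It cannot be applied directly to a data.frame: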

mean(L[[1]])
#[1] NA
#Warning message:
#In mean.default(L[[1]]) : argument is not numeric or logical: returning NA

mean(L[[1]]$Number)
#[1] 42.85714

Inside the for loop, the same call returns NA (with a warning) for every element:

for(i in L) mean(i)
#  Warning messages:
#1: In mean.default(i) : argument is not numeric or logical: returning NA
#2: In mean.default(i) : argument is not numeric or logical: returning NA
#3: In mean.default(i) : argument is not numeric or logical: returning NA

Once we extract the column 'Number', mean works:

for(i in L) print(mean(i$Number))
#[1] 42.85714
#[1] 19.42857
#[1] 21

But it is easier to keep the data frames in the list and update them there. Use lapply to loop over the list and create a column 'Average' holding the mean of 'Number'. lapply returns a new list, so assign the result back to L if you want to keep it:

lapply(L, transform, Average = mean(Number))
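If the list might also contain elements that are not data frames, as the is.data.frame check in the question suggests, the lapply approach could be wrapped in a small helper. This is only a sketch: add_average is a made-up name, and the character element is added here purely for illustration.

```r
# A mixed list: one data frame (df1 from the question) and one non-data-frame element
L <- list(
  data.frame(Number = c(45, 62, 27, 34, 37, 55, 40)),
  "not a data frame"
)

# Hypothetical helper: add the column only when x is a data frame,
# and return every other element unchanged
add_average <- function(x) {
  if (is.data.frame(x)) {
    x$Average <- mean(x$Number)
  }
  x
}

# lapply returns a new list, so assign it back to keep the result
L <- lapply(L, add_average)
```

Returning x in both cases keeps the list the same length and leaves the non-data-frame elements in place instead of turning them into NULL.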

Or with tidyverse

library(tidyverse)
L %>%
   map(~ .x %>%
            mutate(Average = mean(Number)))

Upvotes: 2

Anonymous

Reputation: 171

You may use purrr::map to do this:

library(purrr)
library(dplyr)  # mutate() comes from dplyr, not purrr

L %>%
  map(
    .f =
      ~ .x %>%
      {
        if (is.data.frame(.)) {
          mutate(., Average = mean(Number))
        }
      }
  )

# [[1]]
#   Number  Average
# 1     45 42.85714
# 2     62 42.85714
# 3     27 42.85714
# 4     34 42.85714
# 5     37 42.85714
# 6     55 42.85714
# 7     40 42.85714
# 
# [[2]]
#   Number  Average
# 1     15 19.42857
# 2     20 19.42857
# 3     32 19.42857
# 4     21 19.42857
# 5     17 19.42857
# 6     18 19.42857
# 7     13 19.42857

# [[3]]
#   Number Average
# 1     12      21
# 2     32      21
# 3     22      21
# 4     14      21
# 5     16      21
# 6     21      21
# 7     30      21

Upvotes: 1
