Is there a way to get the arithmetic mean of the values in a list which contains many lists?

Question

To make my question clearer, here is an example.

Let's say I have a data frame like this:

value <- c(1:1000)
group <- c(1:5)
df <- data.frame(value,group)

I wrote a function, myfun(), that draws random rows from the data frame df and stores them in separate data frames wz1 to wz5. It then binds wz1 to wz5 into one data frame called wza and sums the values by group.

myfun <- function(){
  wz1 <- df[sample(nrow(df), size = 300, replace = FALSE),]
  wz2 <- df[sample(nrow(df), size = 10, replace = FALSE),]
  wz3 <- df[sample(nrow(df), size = 100, replace = FALSE),]
  wz4 <- df[sample(nrow(df), size = 40, replace = FALSE),]
  wz5 <- df[sample(nrow(df), size = 50, replace = FALSE),]

  wza <- rbind(wz1,wz2, wz3, wz4, wz5)
  wza_sum <- aggregate(wza, by = list(group=wza$group), FUN = sum)
  return(wza_sum)
}

Now I want to repeat my function myfun() 100 times with replicate().

dfx <- replicate(100,myfun(),simplify = FALSE)

The output is a list of 100 elements, each of which is a data frame with 5 rows.

[Screenshot: the output as displayed in RStudio]

Now I want to calculate the arithmetic mean of the values for each group (1-5) across all the list elements (1-100). To explain this part a little better, here is another example.

list[[1]] -> group 1 -> value =   53263 
list[[2]] -> group 1 -> value =   51811
list[[3]] -> group 1 -> value =   ...
list[[4]] -> group 1 -> value =   ...
...
list[[100]] -> group 1 -> value = ...
                               -------
                                ∑ / 100



list[[1]] -> group 2 -> value =   50748 
list[[2]] -> group 2 -> value =   49165
list[[3]] -> group 2 -> value =   ...
list[[4]] -> group 2 -> value =   ...
...
list[[100]] -> group 2 -> value = ...
                               -------
                                ∑ / 100

I want to calculate the arithmetic mean for every group. Is there a way to achieve this?

Eugene Chong · Accepted Answer

Here's a dplyr solution that uses bind_rows() to collapse dfx into a single dataframe.

Note that I renamed the grouping column to group_ID in the aggregate() call within myfun(). The data frames in your original dfx object each had two separate columns both called group.
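To see where the duplicate column name comes from, here is a small sketch. The names df_demo, dup and renamed are made up for illustration, and the 10-row data frame is only a stand-in for the question's df.

```r
# Small stand-in for the question's data: 10 values, 5 recycled groups.
df_demo <- data.frame(value = 1:10, group = 1:5)

# Naming the grouping variable "group" yields a second column with that name,
# because aggregate() also sums the group column already present in the input.
dup <- aggregate(df_demo, by = list(group = df_demo$group), FUN = sum)
names(dup)

# Giving the grouping variable a different name keeps the column names unique.
renamed <- aggregate(df_demo, by = list(group_ID = df_demo$group), FUN = sum)
names(renamed)
```

In the second call, group_ID holds the group label, while the remaining group column holds the summed labels and can be ignored.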

library(dplyr)

value <- c(1:1000)
group <- c(1:5)
df <- data.frame(value, group)

myfun <- function(){
  wz1 <- df[sample(nrow(df), size = 300, replace = FALSE),]
  wz2 <- df[sample(nrow(df), size = 10, replace = FALSE),]
  wz3 <- df[sample(nrow(df), size = 100, replace = FALSE),]
  wz4 <- df[sample(nrow(df), size = 40, replace = FALSE),]
  wz5 <- df[sample(nrow(df), size = 50, replace = FALSE),]

  wza <- rbind(wz1,wz2, wz3, wz4, wz5)
  wza_sum <- aggregate(wza, by = list(group_ID=wza$group), FUN = sum)
  return(wza_sum)
}

dfx <- replicate(100,myfun(),simplify = FALSE)

dfx_df <- bind_rows(dfx) %>% 
  group_by(group_ID) %>% 
  summarize(group_mean = mean(value))

Result

> head(dfx_df)
# A tibble: 5 x 2
  group_ID group_mean
     <int>      <dbl>
1        1     50064.
2        2     49806.
3        3     48814.
4        4     50051.
5        5     50972.
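
If you would rather not load dplyr, the same idea can probably be sketched in base R with do.call(rbind, ...) and the formula interface of aggregate(). This is an alternative sketch rather than part of the solution above. myfun_demo() is a shortened, hypothetical stand-in for myfun() that draws one sample instead of five, and the seed is only an assumption to make the run reproducible.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

df <- data.frame(value = 1:1000, group = 1:5)

# Shortened stand-in for myfun(): one sample instead of five, same aggregation.
myfun_demo <- function() {
  wza <- df[sample(nrow(df), size = 300, replace = FALSE), ]
  aggregate(wza, by = list(group_ID = wza$group), FUN = sum)
}

dfx <- replicate(100, myfun_demo(), simplify = FALSE)

# Stack the 100 data frames into one, then average the summed values per group.
stacked <- do.call(rbind, dfx)
group_means <- aggregate(value ~ group_ID, data = stacked, FUN = mean)
names(group_means)[2] <- "group_mean"
group_means
```

The result has one row per group_ID, like the tibble above, but as a plain data frame.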
