Reputation: 193
To explain my question, here is an example.
Let's say I have a data frame like this:
value <- c(1:1000)
group <- c(1:5)
df <- data.frame(value,group)
I have written my own function myfun() that draws random rows from the data frame df and stores them in separate data frames wz1 to wz5. The function then binds wz1 to wz5 into one data frame called wza and sums the values by group.
myfun <- function(){
  wz1 <- df[sample(nrow(df), size = 300, replace = FALSE),]
  wz2 <- df[sample(nrow(df), size = 10, replace = FALSE),]
  wz3 <- df[sample(nrow(df), size = 100, replace = FALSE),]
  wz4 <- df[sample(nrow(df), size = 40, replace = FALSE),]
  wz5 <- df[sample(nrow(df), size = 50, replace = FALSE),]
  wza <- rbind(wz1, wz2, wz3, wz4, wz5)
  wza_sum <- aggregate(wza, by = list(group=wza$group), FUN = sum)
  return(wza_sum)
}
Now I want to repeat my function myfun() 100 times with replicate().
dfx <- replicate(100,myfun(),simplify = FALSE)
The output is a list with 100 elements, and each element is a data frame with 5 rows.
Here is a picture of what the output looks like in RStudio.
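For readers without the screenshot, the structure of the list can also be inspected directly in the console. The following is only a sketch: it rebuilds the example with a compact but equivalent version of myfun(), and the set.seed() call and the helper name sizes are assumptions added here for reproducibility and brevity, not part of the original code.

```r
# Assumption: seed added only so that repeated runs give the same draws
set.seed(1)

value <- 1:1000
group <- 1:5
df <- data.frame(value, group)

# Compact equivalent of myfun(): one sample per size, then bind and sum by group
myfun <- function() {
  sizes <- c(300, 10, 100, 40, 50)  # hypothetical helper vector
  parts <- lapply(sizes, function(n) df[sample(nrow(df), size = n, replace = FALSE), ])
  wza <- do.call(rbind, parts)
  aggregate(wza, by = list(group = wza$group), FUN = sum)
}

dfx <- replicate(100, myfun(), simplify = FALSE)

length(dfx)       # number of list elements
class(dfx[[1]])   # class of a single element
dim(dfx[[1]])     # rows and columns of a single element
names(dfx[[1]])   # column names; note that "group" shows up more than once
```

Because aggregate() is called on the whole data frame, the grouping column is also summed, which is why each element carries a second column named group next to value.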
Now I want to calculate the arithmetic mean of the values for each group (1-5) across all list elements (1-100). To explain this part a little better, here is another example.
list[[1]] -> group 1 -> value = 53263
list[[2]] -> group 1 -> value = 51811
list[[3]] -> group 1 -> value = ...
list[[4]] -> group 1 -> value = ...
...
list[[100]] -> group 1 -> value = ...
-------
∑ / 100
list[[1]] -> group 2 -> value = 50748
list[[2]] -> group 2 -> value = 49165
list[[3]] -> group 2 -> value = ...
list[[4]] -> group 2 -> value = ...
...
list[[100]] -> group 2 -> value = ...
-------
∑ / 100
I want to calculate this arithmetic mean for every group. Is there a way to achieve this?
Upvotes: 0
Views: 87
Reputation: 887881
The function can also be modified to make it simpler using sample_n
library(dplyr)
library(purrr)
myfun <- function(){
  map_dfr(c(300, 10, 100, 40, 50), ~ df %>% sample_n(.x)) %>%
    group_by(group) %>%
    summarise(value = sum(value))
}
Now we use rerun from purrr and then bind the rows, as in the other solution:
rerun(5, myfun()) %>%
  bind_rows %>%
  group_by(group) %>%
  summarise(value = mean(value))
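Note that rerun() was deprecated in purrr 1.0.0, and sample_n() has been superseded by slice_sample() in dplyr 1.0.0. A possible variant of the same idea with the current functions is sketched below. The set.seed() call and the name result are assumptions added for illustration, and 100 repetitions are used to match the question.

```r
library(dplyr)
library(purrr)

# Assumption: seed added only for reproducibility
set.seed(42)

df <- data.frame(value = 1:1000, group = 1:5)

# Draw one sample per size, stack them, and sum the values by group
myfun <- function() {
  map(c(300, 10, 100, 40, 50), ~ slice_sample(df, n = .x)) %>%
    bind_rows() %>%
    group_by(group) %>%
    summarise(value = sum(value))
}

# Repeat 100 times, stack all runs, and average the sums by group
result <- map(seq_len(100), ~ myfun()) %>%
  bind_rows() %>%
  group_by(group) %>%
  summarise(value = mean(value))

result
```

Here map() iterates over seq_len(100) only to count repetitions; the index itself is ignored inside the lambda.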
Upvotes: 2
Reputation: 1741
Here's a dplyr solution that uses bind_rows() to collapse dfx into a single dataframe.
Note that I renamed your group column as group_ID within myfun(). The dataframes within your original dfx object had two separate columns both called group.
library(dplyr)
value <- c(1:1000)
group <- c(1:5)
df <- data.frame(value, group)
myfun <- function(){
  wz1 <- df[sample(nrow(df), size = 300, replace = FALSE),]
  wz2 <- df[sample(nrow(df), size = 10, replace = FALSE),]
  wz3 <- df[sample(nrow(df), size = 100, replace = FALSE),]
  wz4 <- df[sample(nrow(df), size = 40, replace = FALSE),]
  wz5 <- df[sample(nrow(df), size = 50, replace = FALSE),]
  wza <- rbind(wz1, wz2, wz3, wz4, wz5)
  wza_sum <- aggregate(wza, by = list(group_ID=wza$group), FUN = sum)
  return(wza_sum)
}
dfx <- replicate(100, myfun(), simplify = FALSE)
dfx_df <- bind_rows(dfx) %>%
  group_by(group_ID) %>%
  summarize(group_mean = mean(value))
Result:
> head(dfx_df)
# A tibble: 5 x 2
  group_ID group_mean
     <int>      <dbl>
1        1     50064.
2        2     49806.
3        3     48814.
4        4     50051.
5        5     50972.
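If you would rather avoid the dplyr dependency, the same collapse-and-average step can be sketched in base R with do.call(rbind, ...) and the formula interface of aggregate(). The names sizes, all_runs, group_means and expected, as well as the set.seed() call, are assumptions introduced for this illustration. The last lines add a rough sanity check: each run draws 500 of the 1000 rows in total, so the expected sum per group is half of that group's full sum, and the simulated means should land near those values.

```r
# Assumption: seed added only for reproducibility
set.seed(123)

df <- data.frame(value = 1:1000, group = 1:5)

# Same logic as myfun() above, with the grouping column named group_ID
myfun <- function() {
  sizes <- c(300, 10, 100, 40, 50)  # hypothetical helper vector
  parts <- lapply(sizes, function(n) df[sample(nrow(df), size = n, replace = FALSE), ])
  wza <- do.call(rbind, parts)
  aggregate(wza, by = list(group_ID = wza$group), FUN = sum)
}

dfx <- replicate(100, myfun(), simplify = FALSE)

# Stack the 100 data frames, then average the per-run sums by group
all_runs <- do.call(rbind, dfx)
group_means <- aggregate(value ~ group_ID, data = all_runs, FUN = mean)
group_means

# Sanity check: 500 of 1000 rows are drawn per run, so on average
# each group contributes half of its total sum
expected <- 0.5 * tapply(df$value, df$group, sum)
expected
```

The simulated means will differ from the expected values by sampling noise, and that noise shrinks as the number of repetitions grows.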
Upvotes: 2