alittleboy

Reputation: 10966

Row-binding multiple datasets simulated from one vector of sample sizes

In a simulation study, I have multiple datasets with the same columns but different numbers of rows. Each dataset is simulated based on n, its sample size. For example:

n.vec = c(5, 10, 15, 20)
data1 = data.frame(x=rnorm(n.vec[1], 0, 1), y=rnorm(n.vec[1], 0, 1))
data2 = data.frame(x=rnorm(n.vec[2], 1, 2), y=rnorm(n.vec[2], 1, 2))
data3 = data.frame(x=rnorm(n.vec[3], 3, 4), y=rnorm(n.vec[3], 3, 4))
data4 = data.frame(x=rnorm(n.vec[4], 5, 6), y=rnorm(n.vec[4], 5, 6))
mega.data = rbind(data1, data2, data3, data4)

Without using a loop (n.vec may be of any length), is there an efficient way to row-bind all the datasets into a single "mega dataset" like mega.data above? Preferably using the dplyr package. Thanks.

Upvotes: 1


Answers (1)

alistaire

Reputation: 43354

A simple way is to collect your parameters in a data frame, then use purrr::pmap to iterate over its rows, generating one dataset per row. Inside the formula, ... forwards that row's n, mean, and sd to rnorm by name. The pmap_df variant calls dplyr::bind_rows on the results, binding them into a single data frame.

library(tidyverse)
set.seed(47)

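# One row of parameters per dataset; tibble() replaces the deprecated data_frame()
mega_data <- tibble(n = c(5, 10, 15, 20),
                    mean = c(0, 1, 3, 5),
                    sd = c(1, 2, 4, 6)) %>%
    # `...` passes the row's n, mean, and sd on to rnorm() by name
    pmap_df(~list(x = rnorm(...),
                  y = rnorm(...)))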

mega_data
#> # A tibble: 50 x 2
#>         x       y
#>     <dbl>   <dbl>
#>  1  1.99  -1.09  
#>  2  0.711 -0.985 
#>  3  0.185  0.0151
#>  4 -0.282 -0.252 
#>  5  0.109 -1.47  
#>  6 -0.845 -2.13  
#>  7  1.08   1.50  
#>  8  1.99   0.319 
#>  9 -2.66   1.83  
#> 10  1.18   0.347 
#> # ... with 40 more rows
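
If you also want to know which simulated dataset each row came from, here is a minimal sketch of the same idea split into two steps: plain pmap to build a list of data frames, then dplyr::bind_rows with its .id argument. The names params, simulate_one, and the dataset column are my own choices for illustration, not something from the question, and the parameter values are the ones assumed above.

```r
library(purrr)
library(dplyr)

set.seed(47)

# Parameter table, one row per simulated dataset (values assumed from the question)
params <- tibble(n = c(5, 10, 15, 20),
                 mean = c(0, 1, 3, 5),
                 sd = c(1, 2, 4, 6))

# Illustrative helper (hypothetical name): simulate one dataset from one row of parameters
simulate_one <- function(n, mean, sd) {
  tibble(x = rnorm(n, mean, sd),
         y = rnorm(n, mean, sd))
}

# pmap() matches the columns of params to the arguments of simulate_one() by name
# and returns a list with one tibble per row of params
sims <- pmap(params, simulate_one)

# bind_rows() stacks the list; .id adds a column holding each element's position
mega_data <- bind_rows(sims, .id = "dataset")
```

Because the list is unnamed, the dataset column holds the positions "1" to "4" as character strings, so count(mega_data, dataset) should give back the sample sizes in n.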

Upvotes: 2
