Tendero

Reputation: 1166

How to efficiently do a sweep of many parameters to run a script in R?

I have a script in R and it has 4 parameters that can be varied. I want to run a sweep with a few combinations of these parameters, and time these runs to compare them afterwards. Something like this:

dim_map = c(10,40,80,120)
epochs = c(200,600,1000)
dim_input = c(3,80,400,1000,3000)
datapoints = c(15000,50000,100000)
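# Use numeric() rather than c(): c() is NULL, which would give a data frame with no columns
results = data.frame(dim_map = numeric(),
                     epochs = numeric(),
                     dim_input = numeric(),
                     datapoints = numeric(),
                     time = numeric()
)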

for(dim in dim_map){
  for (epoch in epochs){
    for (m in dim_input){
      for (n in datapoints){

        t = Sys.time() # Start time

        ## Run some script

        elapsed_time = as.numeric(Sys.time() - t, units = 'secs')

        results[nrow(results)+1,] = c(dim, epoch, m, n, elapsed_time)
      }
    }
  }
}

Is there a way to do this without the nested loops? I suspect the loops are slowing down the sweep, but that may just be my imagination. Alternatively, is there a better way to time the script across these parameter variations?

Upvotes: 0

Views: 198

Answers (1)

Calum You

Reputation: 15072

I think one of the easiest ways to do this kind of thing is to combine pmap and cross_df from purrr. We can easily create all the combinations of parameters and then run our code for each of them, storing the results in a new column:

library(tidyverse)

params <-  cross_df(list(
  dim_map = c(10,40,80,120),
  epochs = c(200,600,1000),
  dim_input = c(3,80,400,1000,3000),
  datapoints = c(15000,50000,100000)
))

timer <- function(dim_map, epochs, dim_input, datapoints){
  start_time = Sys.time()
  Sys.sleep(0.01) # your code to time here
  end_time = Sys.time()

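  # Force seconds: a bare difference picks its own units (secs, mins, ...) per run
  return(as.numeric(difftime(end_time, start_time, units = "secs")))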
}

params %>%
  mutate(time = pmap_dbl(., timer))
#> # A tibble: 180 x 5
#>    dim_map epochs dim_input datapoints   time
#>      <dbl>  <dbl>     <dbl>      <dbl>  <dbl>
#>  1      10    200         3      15000 0.0110
#>  2      40    200         3      15000 0.0110
#>  3      80    200         3      15000 0.0110
#>  4     120    200         3      15000 0.0110
#>  5      10    600         3      15000 0.0110
#>  6      40    600         3      15000 0.0110
#>  7      80    600         3      15000 0.0110
#>  8     120    600         3      15000 0.0109
#>  9      10   1000         3      15000 0.0110
#> 10      40   1000         3      15000 0.0110
#> # ... with 170 more rows

Created on 2018-09-21 by the reprex package (v0.2.0).
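
If you would rather avoid the tidyverse dependency, the same idea can be sketched in base R with expand.grid and mapply. This is only a sketch under some assumptions: run_once is a hypothetical stand-in for your script (not something from the question), and system.time is swapped in for the manual Sys.time arithmetic. As far as I know, recent purrr releases deprecate cross_df in favour of tidyr::expand_grid, so a base R grid also sidesteps that.

```r
# Hypothetical stand-in for the real script; replace the body with your own code.
run_once <- function(dim_map, epochs, dim_input, datapoints) {
  Sys.sleep(0.01)
  invisible(NULL)
}

# All parameter combinations; the first argument varies fastest.
params <- expand.grid(
  dim_map    = c(10, 40, 80, 120),
  epochs     = c(200, 600, 1000),
  dim_input  = c(3, 80, 400, 1000, 3000),
  datapoints = c(15000, 50000, 100000)
)

# Time each combination; the "elapsed" entry of system.time() is wall-clock seconds.
params$time <- mapply(
  function(dim_map, epochs, dim_input, datapoints) {
    system.time(run_once(dim_map, epochs, dim_input, datapoints))[["elapsed"]]
  },
  params$dim_map, params$epochs, params$dim_input, params$datapoints
)

head(params)
```

Either way, the loop itself is unlikely to be what costs you time: with 180 combinations the iteration overhead should be tiny next to whatever the script does on each run. The main practical gain from building the grid up front is that the results table has its final shape before anything runs, instead of growing row by row.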

Upvotes: 1
