mac

Reputation: 25

Repeat R script multiple times

I have an R script with hundreds of lines that eventually produces a single numerical answer. I want to build a confidence interval, so I need to run the whole script many times and calculate the mean and standard deviation of the results. I do not want to wrap the whole thing in a 'for' loop because that becomes really complicated.

After some research, I came across this method:

My final answer is named 'result', and in a new script file I run:

result_list<-lapply(1:10, function(n)source("my_script_file.R"))
result_list

(repeating 10 times for example)

However, the final result looks like this:

[[1]]
[[1]]$value
[1] 136.9876

[[1]]$visible
[1] TRUE

[[2]]
[[2]]$value
[1] 138.4969

[[2]]$visible
[1] TRUE

[[3]]
[[3]]$value
[1] 0.2356484

[[3]]$visible
[1] TRUE

. 
.

I have no idea what the second line ($visible) means in each iteration. How do I get the list of values (result_list$values doesn't work) while ignoring values that are too small, like the 3rd one here, which could be simulation errors, so that I can calculate the mean and sd?

Also, is there any other way to repeat this process besides this method?

Upvotes: 2

Views: 11817

Answers (2)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

I would recommend turning your script into a function, loading the function once, and then using replicate instead of lapply(1:n, ...).

Here's a very simple example:

Imagine you were working with a simple R script file that had the following contents:

## saved in working directory as "testfun.R"
myFun <- function(x, y, z) {
  mean(rnorm(x)) + mean(rnorm(y)) + mean(rnorm(z))
}

myFun(10, 12, 14)
## End of "testfun.R" file

Now, compare the timings of sourcing the file 100 times with simply running the function 100 times:

fun1 <- function(n = 10) replicate(n, myFun(10, 12, 14))
fun2 <- function(n = 10) lapply(1:n, function(x) source("testfun.R")$value)

library(microbenchmark)
microbenchmark(fun1(100), fun2(100), unlist(fun2(100)), times = 1)
## Unit: milliseconds
##               expr       min        lq      mean    median        uq       max neval
##          fun1(100)  3.064384  3.064384  3.064384  3.064384  3.064384  3.064384     1
##          fun2(100) 59.635228 59.635228 59.635228 59.635228 59.635228 59.635228     1
##  unlist(fun2(100)) 61.349713 61.349713 61.349713 61.349713 61.349713 61.349713     1

I'm not sure how much of a difference it would make in the long run if more of the time is taken up in processing (rather than reading the source file), but I would still consider a function + replicate as a cleaner and easier-to-read alternative.
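To connect this back to the goal in the question (mean, standard deviation, and a confidence interval), here is a minimal sketch of how the replicate output could be summarised. The function name run_simulation and its body are hypothetical stand-ins for the real script, and the interval is a plain t-based interval for the mean, which assumes the repeated results are independent and roughly normal.

```r
# Hypothetical stand-in for the body of the long script, wrapped in a function.
run_simulation <- function() {
  mean(rnorm(10)) + mean(rnorm(12)) + mean(rnorm(14))
}

set.seed(42)  # only so that the sketch is reproducible
results <- replicate(1000, run_simulation())  # numeric vector of length 1000

m <- mean(results)
s <- sd(results)
n <- length(results)

# 95% t-based confidence interval for the mean of the repeated results
half_width <- qt(0.975, df = n - 1) * s / sqrt(n)
ci <- c(lower = m - half_width, upper = m + half_width)
ci
```

Because replicate simplifies its output to a vector when each run returns a single number, mean and sd can be applied to it directly without unlist.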

Upvotes: 2

akrun

Reputation: 887048

We can use $value to get the 'value' from each iteration:

 lapply(1:10, function(n)source("my_script_file.R")$value)

As each value is a single element, it may also be useful to use sapply to get a vector output:

 v1 <- sapply(1:10, function(n)source("my_script_file.R")$value)

We can subset the vector for values greater than a particular threshold, for example 0.5:

 v1[v1 > 0.5] 
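On the "second line" in the question's output: source() returns a list with two elements, value (the result of the last expression in the file) and visible (whether that result would have been auto-printed at the console), which is why result_list$values finds nothing. The sketch below shows this with a small hypothetical script written to a temporary file. The script contents are made up for illustration, and the 0.5 threshold is just the example value used above.

```r
# Write a tiny hypothetical script to a temporary file so the sketch is self-contained.
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- rnorm(20, mean = 100)",
  "result <- mean(x)",
  "result"                      # the last expression becomes $value
), script)

out <- source(script)
names(out)  # the two elements: "value" and "visible"

# vapply checks that every run returns exactly one number.
# local = TRUE evaluates the script inside the anonymous function,
# so its variables do not accumulate in the global environment.
values <- vapply(1:10, function(i) source(script, local = TRUE)$value, numeric(1))

# Keep only values above the chosen threshold, then summarise.
kept <- values[values > 0.5]
mean(kept)
sd(kept)
```

Compared with sapply, vapply stops with an error if a run returns something other than a single number, which can help surface a failed simulation run early.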

Upvotes: 0
