user1224802
user1224802

Reputation: 11

How to replicate my simulation study

I am a beginner in R and I am doing a simulation study which I managed to produce one sample what I want to do. However, I have no idea how I should replicate what I did.

Here is the program I wrote so far:

I <- 500       # number of observations
J <- 18        # total number of items
K <- 6         # number of testlets
JK <-3         # number of items within a testlet
response <- matrix(0, I, J)  # null binary (0, 1) response matrix 
unit <- matrix(1, JK, 1)     # unit vector

set.seed(1234)

# Multidimensional 3-pl model
pij <- function(a,b,c,theta,gamma) {c+(1-c)*(1/(1+exp(-1.7*a*(theta-b-gamma))))}

# Assigning a and b parameter values
a <- c(.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7,.8,.9,.7)
b <-c(1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5,1,0,-1.5)
# Assigning c-parameter, each 3 items (c-parameter & testlet effect)
#(small&small, small&large, large&small, large&large, mixed&small, mixed&large)
c <- c(.2,.2,.2,.2,.2,.2,.5,.5,.5,.5,.5,.5,.2,.33,.5,.2,.33,.5)    

theta <- rnorm(I, 0, 1)   # random sampling theta-values from normal dist. M=0, SD=1

gamma1 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma2 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1
gamma3 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma4 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1
gamma5 <- rnorm(I, 0, .2)  # small testlet effect: random sampling gamma from normal dist. M=0, SD=.2
gamma6 <- rnorm(I, 0, 1)   # large testlet effect: random sampling gamma from normal dist. M=0, SD=1

# implementing that the testlet effect is same for the items within a testlet
gamma1T <- gamma1 %*% t(unit)
gamma2T <- gamma2 %*% t(unit)
gamma3T <- gamma3 %*% t(unit)
gamma4T <- gamma4 %*% t(unit)
gamma5T <- gamma5 %*% t(unit)
gamma6T <- gamma6 %*% t(unit)

gammaT <- matrix(c(gamma1T, gamma2T, gamma3T, gamma4T, gamma5T, gamma6T), I, J)  # getting all the gammas together in a large matrix

# Generating data using the information above
for(i in 1:I) {
  for(j in 1:J) {
    response[i, j] <- ifelse(pij(a=a[j], b=b[j], c=c[j], theta=theta[i], gamma=gammaT[i,j]) < runif(1), 0, 1)
  }
}

Thus, I get a data set "response". What I want to do is to replicate this and get say, 1000 "response" data sets. I think this can be done by replicating the random sampling for "theta" and "gamma", but I do not have an idea to actually do this.

Many, many thanks in advance,

Hanjoe.

Upvotes: 1

Views: 2501

Answers (2)

Roland Ewald
Roland Ewald

Reputation: 4614

The advice of Stedy is reasonable, apart from one thing: DON'T increment the seed in the for loop.

As fas as I understand Stedy's suggestion, set.seed(i) would be called within the for loop for each simulation, with i being incremented in each iteration. This strategy is known to introduce (potentially large) bias due to correlations among the generated sequences.

Instead, set the seed once at the beginning, i.e. before the for loop. For example, you could use the current Unix time as a seed, or read one from a file with random numbers (e.g. from random.org). Also, make sure to store the seed with your results, e.g. print it to a log file. If you want to reproduce the exact results of a previous set of replications again, you just have to set the corresponding seed.

If you want others to be able to exactly replicate your results, you should also specify the R version (and maybe the operating system) you used (as the RNG implementations may vary).

As an aside, simulation replication is an embarassingly parallel task, i.e. you can easily execute the replications in parallel if you have a multi-core machine (e.g. with rparallel). In this case, however, extra care regarding random numbers is necessary (e.g. see this paper).

Upvotes: 4

Stedy
Stedy

Reputation: 7469

I would take the local variables and make them into a function. Then make a for() loop, call the function and increment set.seed() by one each time the function is called for the length of the for() loop.

Upvotes: 1

Related Questions