Ben Z

Reputation: 105

Is there a faster way to simulate data iteratively with R code in PROC IML and then analyze it with a SAS procedure?

The following code is what I came up with, but it is rather slow. Any suggestions? Thank you!

In detail: first create a data set in PROC IML using R code, then pass it to a regular PROC MIXED step to analyze it, then use PROC APPEND to store the results, and repeat the whole process 100,000 times.

proc iml;
  do i= 1 to 100000;  
    submit / R;
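       library(mvtnorm)
       library(dplyr)
       library(tidyr)
       # 10 x 5 matrix of means: one row per subject, one column per time point
       beta <- matrix(1:50, byrow = TRUE,10,5)
       # 5 x 5 matrix, made symmetric by copying the upper triangle into the lower
       sigma <- matrix(1:25, 5)
       sigma [lower.tri(sigma )] = t(sigma )[lower.tri(sigma )]
       # one multivariate normal draw per row of beta
       sample <- t(apply(beta, 1, function(m) rmvnorm(1, mean=m, sigma=sigma)))
       # five subjects in group A, five in group B; cbind() stores the factor as codes 1 and 2
       Group = rep(factor(LETTERS[1:2]),each=5,1)
       sample <- cbind(sample,Group,c(1:5))
       # note: concat is defined here but never called below
       concat <- function(x) paste0('Visit', x[, 2], 'Time', x[, 1])
       cnames <- c(paste0("Time", 1:5),"Group","ID")
       colnames(sample) <- cnames
       sample <- data.frame(sample)
       # wide to long: one row per subject and visit
       sample <- gather(sample, Visit, Response, paste0("Time", 1:5), factor_key=TRUE)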
    endsubmit;

    call ImportDataSetFromR( "rdata", "sample" );

    submit;
       Proc mixed data=rdata;
          ods select none;
          class Group Visit ID;
          model Response = Visit|Group;
          repeated  Visit/ subject=ID type=un;
          ods output  Tests3=Test;
       run;
       proc append data=Test base=result force ;
       run;
    ENDSUBMIT;
  end;
Quit;
proc print data=result;
run;

Upvotes: 0

Views: 654

Answers (2)

Rick

Reputation: 1210

The ideal approach is to do the full simulation in SAS/IML, because that minimizes the transfer of data between SAS and R. You can use the RANDNORMAL function to simulate multivariate normal data. Use the CREATE/APPEND statements to save the simulated samples to a SAS data set. Then call PROC MIXED once and use a BY statement to analyze all the samples. See "Simulation in SAS" for the general ideas. No SUBMIT blocks are required. If you run into programming issues, consult the "Simulation" posts on The DO Loop blog, or if you intend to do a lot of simulation in SAS, you might want to find a copy of Simulating Data with SAS (Wicklin, 2013).

If you don't know SAS/IML well enough to run the simulation there, then generate all 100,000 samples in R (vectorize, if possible) and create a SampleID variable that identifies each sample. Then import the entire data set into SAS in one step and use the BY statement trick to do the analysis.
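As a rough illustration of the second option, here is a minimal base-R sketch that draws every sample in one vectorized pass and tags each row with a SampleID. Several things in it are assumptions rather than part of the question: the covariance matrix (an AR(1)-style matrix chosen only because it is positive definite), the reduced number of samples, the fixed seed, and subject IDs that run from 1 to 10 so that each subject is unique.

```r
# Sketch only: simulate all samples at once and label them with SampleID.
set.seed(1)       # assumption: fixed seed so the run is reproducible

n_sim  <- 1000    # assumption: kept small here; the question uses 100000
n_subj <- 10
n_time <- 5

# Subject-level means, laid out as in the question (10 subjects x 5 time points)
beta <- matrix(1:50, nrow = n_subj, ncol = n_time, byrow = TRUE)

# Assumption: AR(1)-style covariance, picked because it is positive definite
sigma <- 0.5 ^ abs(outer(1:n_time, 1:n_time, "-"))

# One row per (sample, subject): stack the mean matrix n_sim times
means <- beta[rep(seq_len(n_subj), times = n_sim), ]

# Standard normal draws, correlated through the Cholesky factor of sigma
z <- matrix(rnorm(n_sim * n_subj * n_time), ncol = n_time)
y <- means + z %*% chol(sigma)

# Long format; as.vector(y) stacks the columns, so Visit varies slowest
n_row <- nrow(y)
sim <- data.frame(
  SampleID = rep(rep(seq_len(n_sim), each = n_subj), times = n_time),
  ID       = rep(rep(seq_len(n_subj), times = n_sim), times = n_time),
  Group    = rep(rep(rep(c("A", "B"), each = n_subj / 2), times = n_sim),
                 times = n_time),
  Visit    = rep(paste0("Time", seq_len(n_time)), each = n_row),
  Response = as.vector(y)
)

# BY-group processing in SAS expects the data sorted by the BY variable
sim <- sim[order(sim$SampleID, sim$ID), ]
```

A data frame like `sim` could then be brought into SAS with a single ImportDataSetFromR call and analyzed with one PROC MIXED step that uses SampleID as the BY variable.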

Upvotes: 3

DomPazz

Reputation: 12465

I don't know exactly what you are doing, so this has to be general.

Move the loop inside the R code. Stay inside R to generate one big data frame and then import that into SAS. Looping over those SUBMIT blocks will be slower: every iteration pays the overhead of calling R, importing the data from R (which is another R call), and then running your SAS append. Putting the loop into R eliminates that per-iteration overhead.
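One way the loop could look once it lives entirely in R is sketched below. The helper `simulate_one` is hypothetical, and the compound-symmetry covariance, the seed, and the small number of samples are assumptions made only so the sketch runs on its own.

```r
# Hypothetical helper: simulate one sample in long format, tagged with its ID.
simulate_one <- function(sample_id, beta, sigma_chol) {
  n_subj <- nrow(beta)
  n_time <- ncol(beta)
  # Correlate standard normal draws through the Cholesky factor
  z <- matrix(rnorm(n_subj * n_time), ncol = n_time)
  y <- beta + z %*% sigma_chol
  data.frame(
    SampleID = sample_id,
    ID       = rep(seq_len(n_subj), times = n_time),
    Group    = rep(rep(c("A", "B"), each = n_subj / 2), times = n_time),
    Visit    = rep(paste0("Time", seq_len(n_time)), each = n_subj),
    Response = as.vector(y)
  )
}

set.seed(2)     # assumption: fixed seed for reproducibility
n_sim <- 200    # assumption: kept small here; the question uses 100000

beta  <- matrix(1:50, nrow = 10, ncol = 5, byrow = TRUE)
# Assumption: compound symmetry (1 on the diagonal, 0.5 elsewhere)
sigma <- diag(5) * 0.5 + 0.5

# The loop now runs inside R; the Cholesky factor is computed only once
all_samples <- do.call(
  rbind,
  lapply(seq_len(n_sim), simulate_one, beta = beta, sigma_chol = chol(sigma))
)
```

With this layout SAS would need only one call into R and one import, instead of one of each per iteration. For very large numbers of samples, binding many small data frames with `rbind` can itself become slow, so a fully vectorized draw may be preferable.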

Upvotes: 1
