user321627

Reputation: 2572

What is the best option on Amazon AWS to run R code in parallel that was designed for a Slurm manager?

I currently have R code that can be run through a Slurm manager with a shell script and a batch script. Essentially, my shell script submits a 1000-task job array, which calls the batch script 1000 times.

I am wondering what the most efficient way would be to transfer this set-up to Amazon AWS. Failing that, what is the most effective way in Amazon AWS to run an R script multiple times and take advantage of as many cores as possible? Is RStudio Server a good option?

Any suggestions would be greatly appreciated. Thanks!

Upvotes: 5

Views: 757

Answers (1)

S.C

Reputation: 740

StarCluster may be a good choice:

StarCluster is an open source cluster-computing toolkit for Amazon’s Elastic Compute Cloud (EC2)

It is part of the STAR program at MIT, which seeks to bridge the divide between scientific research and the classroom.

You can easily deploy a cluster of any size, comprised of instances of your choice. NFS, MPI, and the Open Grid Scheduler resource manager work out of the box. You can also install Slurm on the cluster. Single commands boot or shut down the cluster.
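Since the existing workflow is already a Slurm job array, the batch script should carry over largely unchanged once Slurm is installed on the cluster. A minimal sketch (`analysis.R`, the log path, and the array size are placeholders based on the question's description, not anything StarCluster provides):

```shell
#!/bin/bash
#SBATCH --job-name=r-array
#SBATCH --array=1-1000              # 1000 array tasks, matching the original setup
#SBATCH --output=logs/task_%a.out   # %a expands to the array index

# Slurm exposes each task's index as SLURM_ARRAY_TASK_ID; pass it to the
# R script so every task processes a different piece of the work.
Rscript analysis.R "${SLURM_ARRAY_TASK_ID}"
```

Submitting this once with `sbatch` queues all 1000 tasks, and the scheduler spreads them across the cluster's nodes.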

Simple commands to create and manage clusters are as follows:

**Create and Manage Clusters**

StarCluster allows easily creating one or more clusters of virtual machines in the cloud:

$ starcluster start -s 10 mycluster

Use the listclusters command to keep track of your clusters:

$ starcluster listclusters

Login to the master node of your cluster:

$ starcluster sshmaster mycluster
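From the master node you can submit work to the Open Grid Scheduler that StarCluster configures by default, without installing Slurm at all. A hypothetical sketch (`run_task.sh` and `analysis.R` are placeholder names), assuming the R work is expressed as an OGS/SGE array job instead of a Slurm one:

```shell
# OGS/SGE exposes the array index as SGE_TASK_ID
# (the counterpart of Slurm's SLURM_ARRAY_TASK_ID).
cat > run_task.sh <<'EOF'
#!/bin/bash
Rscript analysis.R "${SGE_TASK_ID}"
EOF

# Queue 1000 array tasks; the scheduler distributes them across all nodes.
qsub -t 1-1000 run_task.sh
```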

Add additional nodes to your cluster for more compute power:

$ starcluster addnode mycluster

Remove idle nodes from your cluster to minimize costs:

$ starcluster removenode mycluster node003

When you’re done using the cluster and wish to stop paying for it:

$ starcluster terminate mycluster

Upvotes: 2
