Reputation: 141
I am trying to utilize parallel nodes to run numerical simulations. I have nodes #0 through 12, and I wish to use each of them individually to run a separate part of the simulation. Essentially, I need to evaluate f(x) for x=1 through 4 on one node, then f(x) for x=5 through 9 on the next node, then f(x) for x=10 through 14 on the next one, and so on from there. Initially, I tried using a loop like:
n=0
while [ $n -le 12 ]
do
    ssh compute-0-$n
    # evaluate f(x) for the x values that I want
    exit
    n=$(($n+1))
done
But this did not work: whenever I used the ssh compute-0-$n command to jump to a node, the connection to the original shell script seemed to cease, and when I would exit the node, the shell script continued along its merry way... I suppose there is a better way to accomplish this, but I am relatively new to this. Can anyone help?
Upvotes: 2
Views: 1536
Reputation: 33685
GNU Parallel is made for exactly this kind of task.
evaluate_f() {
x="$1"
# do some crazy computation
}
seq 48 | env_parallel --env evaluate_f -Snode{1..12} evaluate_f {}
If the machines are not really called node1 .. node12, then it becomes a bit longer:
seq 48 | env_parallel --env evaluate_f -Snode1,nodeb,nodeIII,node0100,node0x5,node6,nodeg,nodeVIII,node01001,node0xa,node11,nodel evaluate_f {}
If you have the nodes in a file:
seq 48 | env_parallel --env evaluate_f --slf my_nodefile evaluate_f {}
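The nodefile lists one sshlogin per line. A sketch of what my_nodefile might contain (the hostnames are placeholders; lines starting with # are ignored):

# my_nodefile: one sshlogin per line
node1
user@node2
# allow at most 4 simultaneous jobs on node3:
4/node3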
What this does is copy the function evaluate_f to the remote servers and run it there with one argument from seq 48. By default it will run one job per CPU core on the servers. This makes sense if your computation is not multithreaded and does not do a lot of disk I/O. This can be changed with --jobs.
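For example, to cap each server at 2 simultaneous jobs instead of one per core (a sketch reusing the setup above):

seq 48 | env_parallel --env evaluate_f --jobs 2 -Snode{1..12} evaluate_f {}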
env_parallel was introduced in version 20160322, so make sure your version is newer than that.
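A quick way to check which version you have installed (env_parallel ships with parallel):

parallel --version | head -n 1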
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to. If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU. GNU Parallel instead spawns a new process whenever one finishes, keeping the CPUs active and thus saving time.
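A sketch of that dynamic scheduling, using sleep as a stand-in for jobs of uneven length (4 job slots, 32 jobs; a new job starts as soon as a slot frees up):

seq 32 | parallel --jobs 4 'sleep $(( {} % 5 )); echo job {} done'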
Installation
You should install GNU Parallel with your package manager, but if GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel
Upvotes: 1
Reputation: 2648
The first thing to understand is that when you run ssh with no remote command (and without the &), ssh itself runs until completion. It opens a new shell on the remote host and reads commands, but not the commands from the script that launched it. The ssh session has no knowledge of that script; it is waiting for commands from stdin.
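To see the difference, compare these two invocations (node name borrowed from the question):

ssh compute-0-1              # opens an interactive remote shell; the script stops here
ssh compute-0-1 'hostname'   # runs hostname on the node, then returns to the script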
You need to do three things: put the computation in a script on the remote host (here called docompute.sh), pass that script to ssh as the command to run instead of opening an interactive session, and append & to the command:

ssh compute-0-$n docompute.sh &

The & will get you the parallelism you want, by running the ssh process in the background. See running same script over many machines for discussion of something quite similar; the use of & to run the command in the background is key there.
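Putting it together, a minimal sketch of the whole loop (docompute.sh is hypothetical and assumed to exist on each node; here it is handed the node number so it can pick its own range of x values):

#!/bin/bash
for n in $(seq 0 12)
do
    ssh compute-0-$n "docompute.sh $n" &
done
wait    # do not exit until every background ssh job has finished

Without the final wait, the script would exit while the remote jobs were still running.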
Upvotes: 1
Reputation: 2894
If you are on Ubuntu, you could use the odp program. It uses parallel ssh to run commands simultaneously; you only need to write your data center configuration and scripts into a config file, then use the program to execute them in parallel.
Here is the URL: http://sourceforge.net/projects/odp/
Upvotes: 0