Zaid

Reputation: 37146

How can I use the Platform LSF blaunch command to start processes simultaneously?

I'm having a hard time figuring out why I can't launch commands in parallel using the LSF blaunch command:

for num in `seq 3`; do
    blaunch -u JobHost ./cmd_${num}.sh &
done

Error message:

Oct 29 13:08:55 2011 18887 3 7.04 lsb_launch(): Failed while executing tasks.
Oct 29 13:08:55 2011 18885 3 7.04 lsb_launch(): Failed while executing tasks.
Oct 29 13:08:55 2011 18884 3 7.04 lsb_launch(): Failed while executing tasks.

Removing the ampersand (&) allows the commands to execute sequentially, but I am after parallel execution.

Upvotes: 1

Views: 2159

Answers (3)

Squirrel

Reputation: 2282

When executed inside a bsub job, a single invocation of blaunch -u <hostfile> <cmd> runs <cmd> in parallel on every host listed in <hostfile>, as long as those hosts are within the job's allocation.

You are instead using 3 separate invocations of blaunch to run 3 separate commands. I can't find this in the documentation, but some testing on a recent version of LSF shows that each task launched in such a job gets a unique task ID, stored in an environment variable called LSF_PM_TASKID. You can verify this in your version of LSF by running something like:

bsub -I -n <num_tasks> blaunch env | grep TASKID

Now, what does this have to do with your question? You want to run ./cmd_$i.sh for i=1,2,3 in parallel through blaunch. To do this, you can write a single script, which I'll call cmd.sh, as follows:

#!/bin/sh
./cmd_${LSF_PM_TASKID}.sh

Now you can replace your for loop with a single invocation of blaunch like so:

blaunch -u JobHost cmd.sh

This will run one instance of cmd.sh on each host listed in the file 'JobHost' in parallel. Each of these instances runs the shell script cmd_X.sh, where X is the value of $LSF_PM_TASKID for that particular task.

If there are exactly 3 hostnames in 'JobHost', you will get 3 instances of cmd.sh, which in turn run one instance each of cmd_1.sh, cmd_2.sh, and cmd_3.sh.
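A slightly more defensive version of the cmd.sh wrapper could look like the sketch below. It assumes, as described above, that LSF sets LSF_PM_TASKID for each task. The function name run_task_script, the optional directory argument, and the exit codes are illustrative choices of mine, not anything LSF defines. Unlike the two-line version, it runs the per-task script through sh, so the scripts do not need the executable bit.

```shell
#!/bin/sh
# Sketch of cmd.sh: dispatch to cmd_<task id>.sh based on LSF_PM_TASKID.
# run_task_script is a hypothetical helper name, not part of LSF.
run_task_script() {
    dir=${1:-.}    # directory holding the cmd_N.sh scripts (assumed layout)

    # Outside a blaunch task the variable is not expected to be set.
    if [ -z "${LSF_PM_TASKID:-}" ]; then
        echo "LSF_PM_TASKID is not set; run this through blaunch inside a bsub job" >&2
        return 2
    fi

    script="$dir/cmd_${LSF_PM_TASKID}.sh"
    if [ ! -f "$script" ]; then
        echo "no script for task ${LSF_PM_TASKID}: $script" >&2
        return 3
    fi

    # Run the per-task script; its exit status becomes the function's status.
    sh "$script"
}
```

In cmd.sh itself, the function would be called once on the last line, for example with run_task_script "$(dirname "$0")", so that each task fails loudly if its task ID has no matching script.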

Upvotes: 1

Greg Shively

Reputation: 1

blaunch is not meant to be used outside the job execution environment provided by bsub. I don't know how to run a different command for each process, but try something like:

bsub -n 3 blaunch ./cmd.sh

Upvotes: 0

0xd

Reputation: 1901

Have you tried nohup? This might work:

for num in `seq 3`; do
    nohup blaunch -u JobHost ./cmd_${num}.sh &>/dev/null &
done
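The loop above discards all output, which makes a failure hard to diagnose. As a rough sketch of the general shell pattern (it does not explain or fix the lsb_launch() error), the version below starts each task in the background, keeps a log file per task, and uses wait to collect each exit status. launch_task and run_all are hypothetical names; launch_task is a stand-in that only echoes, and inside a bsub job it could wrap the blaunch call instead.

```shell
#!/bin/sh
# Sketch: start several commands in the background, keep each one's output
# in its own log file, and report each exit status.
# launch_task is a hypothetical stand-in; inside a bsub job it could wrap
# something like: blaunch -u JobHost "./cmd_$1.sh"
launch_task() {
    echo "task $1 started"
}

run_all() {
    logdir=$1
    shift
    pids=""
    for num in "$@"; do
        # Background the task and remember its PID together with its number.
        launch_task "$num" > "$logdir/task_${num}.log" 2>&1 &
        pids="$pids $!:$num"
    done

    failed=0
    for entry in $pids; do
        pid=${entry%%:*}
        num=${entry##*:}
        # wait <pid> returns the exit status of that background task.
        if wait "$pid"; then
            echo "task $num: ok"
        else
            echo "task $num: failed, see $logdir/task_${num}.log"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling run_all /some/log/dir 1 2 3 returns non-zero if any task failed, so the calling script can tell a clean run from a partial one.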

Upvotes: 0
