EmpireJones

Reputation: 3086

What is the role of 'cluster' software in relation to MPI?

I'm a little confused regarding how a cluster implementation ("Beowulf cluster") relates to a communication protocol such as MPI. What software components are needed to set up a "cluster" using something like OpenMPI?

Upvotes: 3

Views: 153

Answers (3)

Stefano Borini

Reputation: 143925

A cluster, as you know, is a bunch of computers networked together. When you have such a configuration, you normally install and use the following:

  • MPI, for communication between processes
  • NFS, to have a network disk visible and shared to all nodes
  • NTP, to synchronize the time of the nodes so that you can compare log events and timestamps
  • bootp, to boot the nodes from a remote node, so that each node restarts fresh with a guaranteed good and uniform setup
  • a set of cluster utilities to make your life easier, such as a distributed ssh to execute the same command on all nodes at the same time.
  • a task scheduler, or queue manager, such as Condor, LSF or others, that allows you to prioritize job submissions and possibly meter them for limiting/pricing
  • a watchdog, to reboot a node automatically if it gets stuck
  • software control for the UPS (to shut down automatically in case of a prolonged loss of power)

And much more. All of this is separate from MPI. MPI is just a communication channel between processes. MPI alone does not "make the cluster".
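To give a feel for one of the utilities in the list, here is a minimal sketch of a "distributed ssh" helper that runs the same command on every node. It assumes passwordless ssh is already set up between the nodes; the node names and the function names (`build_ssh_command`, `run_on_node`, `run_everywhere`) are hypothetical, chosen for illustration only.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node names; replace with the hosts in your own cluster.
NODES = ["node01", "node02", "node03"]


def build_ssh_command(host, command):
    """Return the argv list that runs `command` on `host` over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_on_node(host, command, timeout=30):
    """Run the command on one node and return (host, exit code, stdout)."""
    result = subprocess.run(
        build_ssh_command(host, command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return host, result.returncode, result.stdout.strip()


def run_everywhere(command, nodes=NODES):
    """Run the same command on all nodes concurrently, one thread per node."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        return list(pool.map(lambda host: run_on_node(host, command), nodes))
```

Calling `run_everywhere("uptime")` would return one `(host, exit code, output)` tuple per node. Ready-made tools do this job better in practice; the point is only that such utilities are ordinary system software and have nothing to do with MPI itself.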

Upvotes: 3

Alex Reynolds

Reputation: 96984

Take a look at Oracle Grid Engine (formerly Sun Grid Engine, and before that CODINE).

Upvotes: 0

carlpett

Reputation: 12613

MPI, as you noted, only provides communication between processes. If you are the only person using the cluster, you really need nothing more (apart from some script to launch your program on all the nodes).
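As a rough sketch of such a launch script, assuming Open MPI (which the question mentions): its `mpirun` accepts a hostfile listing each node and its number of slots. The node names, the file name `hosts.txt`, the program name and the two helper functions below are hypothetical.

```python
def make_hostfile(slots_per_node):
    """Render an Open MPI hostfile: one 'hostname slots=N' line per node."""
    return "".join(
        f"{host} slots={slots}\n" for host, slots in slots_per_node.items()
    )


def build_mpirun_command(program, hostfile_path, num_procs):
    """Return the argv list that launches `program` across the listed nodes."""
    return ["mpirun", "--hostfile", hostfile_path, "-np", str(num_procs), program]


# Hypothetical two-node cluster with four cores per node.
nodes = {"node01": 4, "node02": 4}

# Show what would be written to hosts.txt and which command would be run.
print(make_hostfile(nodes), end="")
print(" ".join(build_mpirun_command("./my_mpi_program", "hosts.txt", sum(nodes.values()))))
```

This only prints the hostfile contents and the command line. Actually running it means saving the hostfile and passing the argv list to something like `subprocess.run`, with the program available at the same path on every node (which is where a shared NFS directory helps).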

In reality, sadly, we seldom have a cluster all to ourselves. That's when you need a scheduler. The scheduler typically handles job submissions and resource allocation, possibly also taking care of prioritization, user management and other things to make your life easier.
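To make "job submissions and resource allocation" a bit more concrete, here is a toy sketch of the core idea: a priority queue of jobs and a count of free nodes. It is not how any real scheduler is implemented; the class `ToyScheduler`, its methods and the job names are all made up for illustration.

```python
import heapq
import itertools


class ToyScheduler:
    """Priority queue of jobs plus a count of free nodes (illustration only)."""

    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self._queue = []               # heap of (priority, seq, name, nodes_needed)
        self._seq = itertools.count()  # tie-breaker: first come, first served

    def submit(self, name, nodes_needed, priority=10):
        """Queue a job; a lower priority number means it runs sooner."""
        heapq.heappush(self._queue, (priority, next(self._seq), name, nodes_needed))

    def dispatch(self):
        """Start queued jobs in priority order until the head job does not fit."""
        started = []
        while self._queue and self._queue[0][3] <= self.free_nodes:
            _, _, name, nodes_needed = heapq.heappop(self._queue)
            self.free_nodes -= nodes_needed
            started.append(name)
        return started

    def finish(self, nodes_released):
        """Return nodes to the pool when a job ends."""
        self.free_nodes += nodes_released


sched = ToyScheduler(total_nodes=4)
sched.submit("alice-sim", nodes_needed=3, priority=5)
sched.submit("bob-test", nodes_needed=2, priority=10)
sched.submit("carol-urgent", nodes_needed=1, priority=1)

print(sched.dispatch())  # ['carol-urgent', 'alice-sim']; bob-test waits for nodes
sched.finish(3)          # alice-sim ends and frees its three nodes
print(sched.dispatch())  # ['bob-test']
```

A real scheduler adds the parts that matter on a shared machine: user accounts, fair-share policies, time limits, accounting, and actually starting the MPI job on the nodes it allocated.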

Upvotes: 1
