smschauhan

Reputation: 121

Ubuntu cluster management

I am trying to figure out a solution for managing a set of Linux machines (OS: Ubuntu, ~40 nodes, same hardware). These machines are supposed to be images of each other: software installed on one needs to be installed on all the others. My software requirements are Hadoop, R and ServiceMix. The R packages on all machines also need to be synchronized (a package installed on one needs to be available on all the others).

The solution I am using right now relies on NFS and pssh. I am hoping there is a better/easier solution out there that would make my life a bit easier. Any suggestion is appreciated.
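For reference, the current pssh workflow is roughly the following (hosts.txt, the login user and the package name are just placeholders):

    # hosts.txt lists one node per line; run the same install on all ~40 nodes
    parallel-ssh -h hosts.txt -l ubuntu -t 0 -i \
        "sudo apt-get update && sudo apt-get install -y r-base"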

Upvotes: 6

Views: 1794

Answers (2)

Dirk is no longer here

Reputation: 368181

I have used a low-tech approach in the past for this by simply sharing (at least parts of) /usr/local/ to keep a common R library in /usr/local/lib/R/site-library/. I guess that could work for your Hadoop installation too.
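A rough sketch of what that looks like, assuming one node exports /usr/local and the others mount it (the server name and network range are placeholders):

    # on the NFS server, in /etc/exports:
    #   /usr/local  192.168.1.0/24(rw,sync,no_subtree_check)

    # on every other node, in /etc/fstab:
    #   nfsserver:/usr/local  /usr/local  nfs  defaults  0  0

    # an install run on any one node then lands in the shared site library
    sudo Rscript -e 'install.packages("XML", repos = "https://cran.r-project.org", lib = "/usr/local/lib/R/site-library")'

Since your nodes all run the same Ubuntu release on the same hardware, compiled packages in the shared library stay binary-compatible across the cluster.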

I tried to keep the rest in Debian / Ubuntu packages and to keep all nodes current. Local R and Ubuntu package repositories (for locally created packages) can also help, but are a bit more work.
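For the local repository part, a minimal sketch of the apt side (the repository host and component names are placeholders; a local R repository works analogously via options(repos = ...) in Rprofile.site):

    # on each node, point apt at the internal repository:
    #   /etc/apt/sources.list.d/local.list
    #   deb http://repo-host/ubuntu local main

    sudo apt-get update
    sudo apt-get install your-locally-built-package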

Upvotes: 3

sarnold

Reputation: 104020

Two popular choices are Puppet from Puppet Labs and Chef from Opscode.
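To give a taste of the Puppet side, a single resource can even be applied ad hoc (the package name is just an example; normally this would live in a manifest served from a central puppet master):

    # a minimal sketch: ensure one package is present on this node
    puppet apply -e "package { 'r-base': ensure => installed }"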

Another potential mechanism is creating a new metapackage that Depends: on the packages you want installed on all machines. When you modify your metapackage, an apt-get update && apt-get -u dist-upgrade on each node would install the new packages everywhere.
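A rough sketch of building such a metapackage with equivs (the package name and dependency list are only illustrative):

    # describe the metapackage in a small control file
    cat > cluster-baseline.ctl <<'EOF'
    Package: cluster-baseline
    Version: 1.0
    Depends: openjdk-6-jre, r-base
    Description: everything each cluster node must have installed
    EOF

    # build the .deb, then publish it in your local apt repository (or push it with pssh)
    equivs-build cluster-baseline.ctl
    sudo dpkg -i cluster-baseline_1.0_all.deb

Whenever the Depends: line grows, rebuilding and upgrading the metapackage pulls the new dependencies onto every node.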

The metapackage approach might be less work to configure and use initially, but Puppet or Chef might provide better returns on investment in the long run, as they can manage far more than just package installs.

Upvotes: 5
