Reputation: 121
I am trying to figure out a solution for managing a set of Linux machines (OS: Ubuntu, ~40 nodes, same hardware). These machines are supposed to be images of each other: software installed on one needs to be installed on all the others. My software requirements are Hadoop, R and ServiceMix. R packages on all machines also need to be synchronized (a package installed on one needs to be available on all the others).
The solution I am using right now is NFS plus pssh. I am hoping there is a better solution out there, one that would make my life a bit easier. Any suggestion is appreciated.
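For reference, this is roughly what my current pssh workflow looks like (hosts.txt and the package names are just placeholders):

    # Install an Ubuntu package on every node listed in hosts.txt:
    pssh -h hosts.txt -i 'sudo apt-get -y install r-base'
    # Install an R package on every node:
    pssh -h hosts.txt -i \
      "sudo Rscript -e 'install.packages(\"xts\", repos=\"http://cran.r-project.org\")'"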
Upvotes: 6
Views: 1794
Reputation: 368181
I have used a low-tech approach in the past for this: simply share (at least parts of) /usr/local/ to keep a common R library in /usr/local/lib/R/site-library/. I guess that could work for your Hadoop installation too.
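Roughly, the sharing can be set up like this (a rough sketch; the server name "master" and the subnet are assumptions, and the NFS packages must already be installed):

    # On the node exporting the directory, add to /etc/exports:
    #   /usr/local/lib/R/site-library  10.0.0.0/24(rw,sync,no_subtree_check)
    sudo exportfs -ra
    # On each client, add to /etc/fstab, then mount:
    #   master:/usr/local/lib/R/site-library  /usr/local/lib/R/site-library  nfs  defaults  0  0
    sudo mount -a

R on Debian/Ubuntu already has /usr/local/lib/R/site-library on its library path, so packages installed there should become visible on every node.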
I tried to keep the rest in Debian/Ubuntu packages and kept all nodes current. Local R and Ubuntu package repositories (for locally created packages) can also help, but they are a bit more work.
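A bare-bones local apt repository can be as simple as the following sketch (the web root and host name are assumptions, and the repository is unsigned):

    # On the machine serving the repository:
    mkdir -p /var/www/debs
    cp mypackage_1.0_all.deb /var/www/debs/
    cd /var/www/debs && dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz
    # On each node, point apt at it:
    echo 'deb http://repohost/debs ./' | sudo tee /etc/apt/sources.list.d/local.list
    sudo apt-get update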
Upvotes: 3
Reputation: 104020
Two popular choices are Puppet from Puppet Labs and Chef from Opscode.
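As a flavour of what that looks like, here is a minimal masterless Puppet sketch (the package names are illustrative; normally you would run an agent against a central puppet master instead):

    # /tmp/base.pp -- ensure the same packages are present on every node
    cat > /tmp/base.pp <<'EOF'
    package { ['r-base', 'openjdk-6-jdk']:
      ensure => installed,
    }
    EOF
    sudo puppet apply /tmp/base.pp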
Another potential mechanism is creating a metapackage that Depends: on
the packages you want installed on all machines. When you modify your metapackage, an apt-get update && apt-get -u dist-upgrade
run on each node would install the new packages everywhere at once.
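One way to build such a metapackage is with the equivs tool; this sketch uses an illustrative package name and dependency list:

    sudo apt-get install equivs
    cat > cluster-base.control <<'EOF'
    Section: misc
    Priority: optional
    Standards-Version: 3.9.2
    Package: cluster-base
    Version: 1.0
    Depends: r-base, openjdk-6-jdk
    Description: metapackage pulling in everything a cluster node needs
    EOF
    equivs-build cluster-base.control
    sudo dpkg -i cluster-base_1.0_all.deb

Distributing new versions of the metapackage through a local package repository lets apt pick them up automatically.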
The metapackage approach might be less work to configure and use initially, but Puppet or Chef might provide better returns on investment in the long run, as they can manage far more than just package installs.
Upvotes: 5