horatio1701d

Reputation: 9159

Medium Hadoop / Spark Cluster Administration

Please let me know if this question is more appropriate for a different channel, but I was wondering what the recommended tools are for installing, configuring, and deploying Hadoop/Spark across a large number of remote servers. I'm already familiar with how to set up all of the software, but I'm trying to determine what I should start using to deploy easily across many servers. I've started to look into configuration management tools (i.e. Chef, Puppet, Ansible), but was wondering which is the best and most user-friendly option to start with. I also do not want to use spark-ec2. Should I write homegrown scripts that loop through a hosts file of IPs? Should I use pssh? pscp? I just want to be able to ssh to as many servers as needed and install all of the software.
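For example, the kind of homegrown script I have in mind would look something like this (a minimal sketch; the hosts.txt path, root user, and apt-get command are just placeholders, and it assumes key-based ssh auth is already configured so there are no password prompts):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class LoopInstall {
    public static void main(String[] args) throws IOException, InterruptedException {
        // One IP or hostname per line; "hosts.txt" is a placeholder path
        List<String> hosts = Files.readAllLines(Paths.get("hosts.txt"));
        // Placeholder install step; in practice this would be a longer script
        String installCmd = "sudo apt-get install -y openjdk-8-jdk";
        for (String host : hosts) {
            // Relies on key-based auth so ssh does not prompt for a password
            Process p = new ProcessBuilder("ssh", "root@" + host, installCmd)
                    .inheritIO() // stream the remote output to this console
                    .start();
            System.out.println(host + " exited with " + p.waitFor());
        }
    }
}
```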

Upvotes: 0

Views: 85

Answers (2)

Joe Widen

Reputation: 2448

Check out Apache Ambari. It's a great tool for centrally managing configs, adding new nodes, monitoring the cluster, etc. This would be your best bet.
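Day-to-day management then happens through Ambari's web UI or its REST API. As a minimal sketch, this lists the clusters an Ambari server manages; the ambari-server hostname, default port 8080, and default admin/admin credentials are placeholders for your own install:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class AmbariClusters {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port; 8080 is Ambari's default
        URL url = new URL("http://ambari-server:8080/api/v1/clusters");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Placeholder credentials; admin/admin is the out-of-the-box default
        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes());
        conn.setRequestProperty("Authorization", "Basic " + auth);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON description of each cluster
            }
        }
    }
}
```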

Upvotes: 0

Hokam

Reputation: 924

If you have some experience with a scripting language, then you can go with Chef. Recipes are already available for deploying and configuring a cluster, and it's very easy to get started with.

And if you want to do it on your own, you can use the sshxcute Java API, which runs scripts on remote servers. You can build up the commands there and pass them to the sshxcute API to deploy the cluster, as in the sketch below.
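Here is a rough sketch of that approach. The class names follow sshxcute's documented usage as I recall it; the host, credentials, download URL, and paths are all placeholders:

```java
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;

public class ClusterDeploy {
    public static void main(String[] args) throws Exception {
        // Placeholder host and credentials for one node of the cluster
        ConnBean cb = new ConnBean("192.168.1.10", "hadoop", "password");
        SSHExec ssh = SSHExec.getInstance(cb);
        try {
            ssh.connect();
            // Each string is one command executed on the remote server
            CustomTask install = new ExecCommand(
                    "wget -q http://example.com/spark.tgz -O /tmp/spark.tgz",
                    "tar -xzf /tmp/spark.tgz -C /opt");
            ssh.exec(install);
        } finally {
            ssh.disconnect();
        }
    }
}
```

To cover the whole cluster, you would loop this over your hosts file, the same way as the homegrown script described in the question.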

Upvotes: 1
