Reputation: 8144
Hi, I have a small doubt. I started using Hadoop out of curiosity, but now I have the following problem. My scenario is this: I have 10 machines connected over a LAN, and I need to create a NameNode on one machine and DataNodes on the remaining 9. Do I need to install Hadoop on all 10 machines?
For example, if I have machines (1..10), where machine 1 is the master and machines (2..10) are the slaves [DataNodes], do I need to install Hadoop on all 10 machines?
I have searched a lot for information on Hadoop clusters on commodity machines, but I didn't find anything related to installation and configuration. Some articles explain how to configure and install Hadoop on a single system, but not in a clustered environment.
Can anyone help me and give me a detailed idea, or suggest articles/links for the above process?
Thanks
Upvotes: 0
Views: 1010
Reputation: 1702
Please see the tutorial below:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
Hope it helps.
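In case the link changes, the gist of that tutorial, once the config files are in place, is just a few commands run from the master. This is a rough sketch, assuming a Hadoop 1.x layout with the same $HADOOP_HOME path on every node, not the tutorial verbatim:

```bash
# One-time step, on the master only: format the HDFS namespace.
$HADOOP_HOME/bin/hadoop namenode -format

# On the master: start the HDFS daemons cluster-wide.
# start-dfs.sh reads conf/slaves and starts a DataNode on each listed host.
$HADOOP_HOME/bin/start-dfs.sh

# On the master: start the JobTracker here and TaskTrackers on the slaves.
$HADOOP_HOME/bin/start-mapred.sh
```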
Upvotes: 1
Reputation: 294197
Yes, you need Hadoop installed on every node, and each node should run the services appropriate for its role. Also, the configuration files present on each node have to coherently describe the topology of the cluster, including the location/name/port of various commonly used resources (e.g. the namenode). Doing this manually, from scratch, is error prone, especially if you have never done it before and don't know exactly what you're trying to do. It would also be good to decide on a specific distribution of Hadoop (HortonWorks, Cloudera, HDInsight, Intel, etc.).
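To make that concrete, here is a minimal sketch of what "coherently describe the topology" means in practice. It assumes Hadoop 1.x conf files, a master host named `master`, and placeholder slave hostnames and port; adjust the names and port to your own network:

```bash
# On every node: the same core-site.xml, pointing at the one NameNode.
# "master" and port 54310 are placeholder values.
cat > "$HADOOP_HOME/conf/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
</configuration>
EOF

# On the master only: list the DataNode hosts, one per line.
cat > "$HADOOP_HOME/conf/slaves" <<'EOF'
slave1
slave2
slave3
slave4
slave5
slave6
slave7
slave8
slave9
EOF
```

The key point is that core-site.xml must be identical on all 10 machines, while the slaves file only matters on the node from which you run the start scripts.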
I would recommend using one of the many deployment solutions out there. My favorite is Puppet, but I'm sure Chef will do too.
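Even without Puppet or Chef, you can approximate what those tools do ("same bits, same config, everywhere") with a plain ssh/rsync loop. This is just a sketch, assuming passwordless ssh and hypothetical hostnames slave1..slave9:

```bash
# Hypothetical hostnames; assumes passwordless ssh to each slave.
for host in slave1 slave2 slave3 slave4 slave5 slave6 slave7 slave8 slave9; do
  # Push an identical Hadoop install + configuration to every node.
  rsync -a "$HADOOP_HOME/" "$host:$HADOOP_HOME/"
done
```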
A different (perhaps better?) alternative is to use Ambari, which is a Hadoop-specialized deployment and administration solution. See Deploying and Managing Hadoop Clusters with AMBARI.
Some Puppet resources to get you started: Using Vagrant, Puppet, Testing & Hadoop
Upvotes: 2