Hardikkumar Mistry

Reputation: 11

Hadoop Cluster - "hadoop" user ssh communication

I am setting up a Hadoop 2.7.3 cluster on EC2 servers: 1 NameNode, 1 Secondary NameNode and 2 DataNodes.

Hadoop's start scripts use SSH to reach the slave nodes and launch the processes there.

  1. Do we need to have the same SSH keys on all the nodes for the hadoop user?
  2. What is the best practice/ideal way to copy or add the NameNode's SSH credentials to the slave nodes?

Upvotes: 1

Views: 1736

Answers (1)

Petro

Reputation: 3652

Do we need to have the same SSH keys on all the nodes for the hadoop user?

  • The same public key (the NameNode's) needs to be in ~/.ssh/authorized_keys on all of the nodes. The matching private key stays on the NameNode.

What is the best practice/ideal way to copy or add the NameNode's SSH credentials to the slave nodes?

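Per documentation (the example uses the ubuntu user; substitute the hadoop user and its home directory if that is the account that starts the daemons):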

Namenode: Password-less SSH

Hadoop needs password-less SSH from the namenode to the data nodes. Create a public/private key pair for this purpose on the namenode.

namenode> ssh-keygen

Accept the default key location (/home/ubuntu/.ssh/id_rsa) and press Enter at the passphrase prompts to leave the passphrase empty.
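If the key has to be created from a provisioning script rather than interactively, the same step could look roughly like the sketch below. The function name generate_key_if_missing is made up for this example; the ssh-keygen options used are -q (quiet), -t/-b (key type and size), -N "" (empty passphrase) and -f (output file). The RSA/4096 choice is an assumption, not something the quoted documentation prescribes.

```shell
# Create a key pair without prompts, but never overwrite an existing key.
# generate_key_if_missing is a hypothetical helper name, not an OpenSSH command.
generate_key_if_missing() {
    key_file=${1:-$HOME/.ssh/id_rsa}

    # Leave an existing key alone so a rerun cannot lock you out of the nodes.
    if [ -f "$key_file" ]; then
        echo "Key already exists: $key_file"
        return 0
    fi

    mkdir -p "$(dirname "$key_file")"
    # -N "" gives an empty passphrase, -f sets the private key path
    # (the public key is written next to it with a .pub suffix).
    ssh-keygen -q -t rsa -b 4096 -N "" -f "$key_file"
}
```

Calling generate_key_if_missing with no argument targets ~/.ssh/id_rsa, which matches the default location used in the rest of this answer.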

Datanodes: Setup Public Key

The public key is saved in /home/ubuntu/.ssh/id_rsa.pub. We need to copy this file from the namenode to each data node and append the contents to /home/ubuntu/.ssh/authorized_keys on each data node.

datanode1> cat id_rsa.pub >> ~/.ssh/authorized_keys
datanode2> cat id_rsa.pub >> ~/.ssh/authorized_keys
datanode3> cat id_rsa.pub >> ~/.ssh/authorized_keys
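A plain `cat >>` adds the key again every time it is run, and sshd can refuse key logins when ~/.ssh or authorized_keys are writable by other users. A minimal sketch of an append that is safe to repeat, assuming the public key file has already been copied to the data node (install_pubkey is a hypothetical helper name):

```shell
# Append a public key to authorized_keys only if it is not already present.
# install_pubkey is a hypothetical helper name, not part of OpenSSH.
install_pubkey() {
    pubkey_file=$1               # e.g. id_rsa.pub copied over from the namenode
    ssh_dir=${2:-$HOME/.ssh}     # .ssh directory of the account being logged into
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"         # owner-only access to the directory
    touch "$auth_file"
    chmod 600 "$auth_file"       # owner-only access to the key list

    key=$(cat "$pubkey_file")
    # -x matches the whole line, -F treats the key as a fixed string.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

On a data node this would be run as `install_pubkey id_rsa.pub`. Where the namenode can already log in to the data node by some other means, the stock ssh-copy-id tool performs the copy and append in one step.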

Namenode: Setup SSH Config

SSH reads per-host settings from a configuration file at ~/.ssh/config. Set it up as shown below, substituting each node's public DNS name for the HostName placeholder (for example, replace <nnode> with the EC2 public DNS name of the NameNode).

Host nnode
  HostName <nnode>
  User ubuntu
  IdentityFile ~/.ssh/id_rsa

Host dnode1
  HostName <dnode1>
  User ubuntu
  IdentityFile ~/.ssh/id_rsa

Host dnode2
  HostName <dnode2>
  User ubuntu
  IdentityFile ~/.ssh/id_rsa

Host dnode3
  HostName <dnode3>
  User ubuntu
  IdentityFile ~/.ssh/id_rsa
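Since the four stanzas differ only in the alias and the host name, they can be generated instead of typed. A small sketch, where ssh_config_entry is a hypothetical helper name and dnode1.example.internal is a placeholder for the real DNS name:

```shell
# Print one ~/.ssh/config stanza in the same layout as the entries above.
# ssh_config_entry is a hypothetical helper name.
ssh_config_entry() {
    alias_name=$1              # short name typed on the command line, e.g. dnode1
    host_name=$2               # the node's DNS name or IP address
    user_name=${3:-ubuntu}     # login user, defaulting to the one used in this answer

    printf 'Host %s\n  HostName %s\n  User %s\n  IdentityFile ~/.ssh/id_rsa\n\n' \
        "$alias_name" "$host_name" "$user_name"
}
```

For example, `ssh_config_entry dnode1 dnode1.example.internal >> ~/.ssh/config` appends the stanza for the first data node.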

At this point, verify that password-less login works for each node as follows. The first time you connect to a node, SSH warns that the host is unknown and asks whether you want to continue connecting. Type yes and press Enter. This is needed only once per node.

namenode> ssh nnode
namenode> ssh dnode1
namenode> ssh dnode2
namenode> ssh dnode3
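Once every host key has been accepted, the check can be repeated non-interactively, for example before running the Hadoop start scripts. A minimal sketch, with check_passwordless as a hypothetical helper name; BatchMode=yes makes ssh fail instead of asking for a password, and ConnectTimeout keeps an unreachable node from hanging the loop:

```shell
# Report which hosts accept a key-based login without any prompt.
# check_passwordless is a hypothetical helper name.
check_passwordless() {
    failed=0
    for host in "$@"; do
        # Run the no-op command "true" remotely; stdin is detached so that
        # ssh cannot wait for keyboard input.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Run it on the namenode as `check_passwordless nnode dnode1 dnode2 dnode3`. It returns non-zero if any node fails. Note that a node whose host key has not been accepted yet is also reported as FAIL, because batch mode cannot answer the confirmation prompt.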

Upvotes: 1
