Reputation: 2685
I'm using a Vagrant provisioning script I wrote myself to install a Cloudera cluster on my local VirtualBox provider. The provisioner is here:
https://github.com/theclue/cdh5-vagrant
Everything works fine in my local environment, but now I'm facing the problem of how to add the EC2 provider. Since the provisioner bakes a heavily hand-tuned Cloudera cluster, it makes no sense to use Whirr for the task. I would like to stick with my beloved Vagrant.
The problem is the network. Each node of the cluster has a private IP in the subnet 10.10.50.*; this ensures that the nodes can communicate with each other and cannot be accessed from outside.
Then I hardcoded these private IPs in the /etc/hosts file of each node. The file is the same on every node and looks something like this:
10.10.50.5 cdh-master
10.10.50.6 cdh-node1
10.10.50.7 cdh-node2
In all Hadoop configuration files edited during the provisioning phase, I used the FQHNs.
In addition, the master node has a second network interface bridged to my real LAN via DHCP, so it gets a public IP of the form 192.168.1.*. This is my virtual cluster's door to the outside world.
But when I build EC2 instances, I know neither the IPs nor the FQHNs in advance, and I don't think I can set up the machines with a private network interface.
What is the best way to set up network naming under these conditions?
Upvotes: 0
Views: 399
Reputation: 13501
EC2 instances can look up their own private IP through the instance metadata service (curl -s http://169.254.169.254/latest/meta-data/local-ipv4), and you can attach private network interfaces (ENIs) to them. Another approach is to tag the instances and use the AWS CLI to query your cluster configuration.
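As a rough illustration of both ideas, here is a minimal Python sketch (standard library only), not a drop-in part of the provisioner. It assumes each instance carries a Name tag holding the hostname you want (cdh-master, cdh-node1, ...) and that you feed it the JSON printed by aws ec2 describe-instances. The function names are made up for this example. local_ipv4 uses the token-based (IMDSv2) flow of the metadata service and only works when run on an EC2 instance.

```python
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest"


def local_ipv4(timeout=2.0):
    """Return this instance's private IPv4 address (only works on EC2)."""
    # IMDSv2: request a short-lived session token first.
    token_req = urllib.request.Request(
        METADATA_BASE + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    # Then read the private IP, passing the token along.
    ip_req = urllib.request.Request(
        METADATA_BASE + "/meta-data/local-ipv4",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(ip_req, timeout=timeout) as resp:
        return resp.read().decode().strip()


def hosts_from_describe_instances(payload, tag_key="Name"):
    """Map tag value -> private IP from parsed `aws ec2 describe-instances` JSON.

    Assumption: the tag named by tag_key holds the desired hostname.
    Instances without that tag or without a private IP are skipped.
    """
    hosts = {}
    for reservation in payload.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            ip = instance.get("PrivateIpAddress")
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            name = tags.get(tag_key)
            if ip and name:
                hosts[name] = ip
    return hosts


def render_hosts(hosts):
    """Render /etc/hosts lines ("<ip> <name>"), sorted by hostname."""
    lines = [f"{ip} {name}" for name, ip in sorted(hosts.items())]
    return "\n".join(lines) + "\n"
```

During provisioning you could run something like aws ec2 describe-instances --filters "Name=tag:cluster,Values=cdh5" (the cluster tag is a hypothetical example), parse the output with json.loads, and append the result of render_hosts to /etc/hosts on every node. The hostnames used in the Hadoop configuration files then stay the same even though the IPs are only known after launch.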
To avoid all this complexity, consider using Amazon Elastic MapReduce (EMR) to provision your Hadoop cluster.
See:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html
https://aws.amazon.com/elasticmapreduce/
Upvotes: 1