kuldeep kulkarni

Reputation: 1

Need help setting up a Hadoop cluster in AWS

I would like to set up a Hadoop cluster in AWS with a total capacity of approximately 100 TB. Looking at the instance types listed at http://aws.amazon.com/ec2/instance-types/ , I cannot find an ideal configuration for the data nodes. I would like to use local disks (SSD or non-SSD) on the worker nodes. For example, if I select the cc2.8xlarge instance for the data nodes, then for 100 TB I would have to set up 30 cc2.8xlarge instances, which would be very costly. Could you please suggest how I should configure my cluster in AWS (EC2) with the minimum number of data nodes? Or is there a standard configuration for Hadoop in AWS?

Upvotes: 0

Views: 80

Answers (2)

BraveNewCurrency

Reputation: 13065

If you want to run Hadoop yourself, then use EBS volumes. You can attach a bunch of volumes (around 10-20, as I recall) to each node, and each volume can be up to 1 TB.
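To get a rough feel for what that means for node count, here is a minimal back-of-the-envelope sketch. It assumes the 10-20 volumes per node and 1 TB per volume figures above (which are from memory), and the function name `nodes_needed` is just made up for illustration. It also has a `replication` parameter, since HDFS keeps 3 copies of each block by default; whether your 100 TB is raw or usable capacity is something only you can say.

```python
import math


def nodes_needed(total_tb, volumes_per_node, tb_per_volume=1.0, replication=1):
    """Estimate how many data nodes are needed to hold total_tb of data.

    replication=1 treats total_tb as raw disk capacity;
    replication=3 matches the HDFS default replication factor.
    """
    raw_tb = total_tb * replication              # disk space actually consumed
    per_node_tb = volumes_per_node * tb_per_volume  # capacity of one node
    return math.ceil(raw_tb / per_node_tb)


if __name__ == "__main__":
    for volumes in (10, 20):
        raw = nodes_needed(100, volumes)
        replicated = nodes_needed(100, volumes, replication=3)
        print(f"{volumes} volumes/node: {raw} nodes raw, {replicated} nodes with 3x replication")
```

So under these assumptions, 100 TB of raw capacity fits on somewhere between 5 and 10 nodes, and roughly three times that if the 100 TB is the data size before replication.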

If you don't want to do it yourself, then look into EMR, as monkeymatrix said.

Upvotes: 0

user1832464

Reputation:

It sounds very much like you want to consider Elastic MapReduce, which is a core AWS service based on Hadoop.

http://aws.amazon.com/elasticmapreduce/

You can specify your configuration and the cluster will launch for you - much easier than trying to configure EC2 instances yourself.
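As a rough idea of what "specifying your configuration" can look like from code, here is a minimal sketch that builds the parameters for the `run_job_flow` call of the boto3 EMR client. The helper name `build_emr_request`, the cluster name, the instance type, the node count, and the release label are all placeholder assumptions, not recommendations, and the two role names are the EMR defaults, which may not exist in your account. The actual API call is left commented out because it needs AWS credentials and would start billable instances.

```python
def build_emr_request(name, core_nodes, instance_type, release_label):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All argument values are supplied by the caller; nothing here is
    a recommended configuration.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "master",
                    "InstanceRole": "MASTER",
                    "InstanceType": instance_type,
                    "InstanceCount": 1,
                },
                {
                    "Name": "core",
                    "InstanceRole": "CORE",  # core nodes run the HDFS data nodes
                    "InstanceType": instance_type,
                    "InstanceCount": core_nodes,
                },
            ],
            # Keep the cluster running after it has no steps left to execute.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default EMR role names; assumed to exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# Placeholder values, chosen only to make the example concrete.
request = build_emr_request(
    name="hadoop-100tb-sketch",
    core_nodes=10,
    instance_type="d2.xlarge",
    release_label="emr-6.15.0",
)

# To actually launch (requires boto3, credentials, and will incur cost):
# import boto3
# response = boto3.client("emr").run_job_flow(**request)
# print(response["JobFlowId"])
```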

Upvotes: 1
