Reputation: 451
I have the following clusters running on overlapping EC2 instances. For example, the Yarn and Memcached clusters share instances 2, 3, and 4. Furthermore, each instance has a different amount of RAM and a different number of vCPUs and cores. Could this cause problems, or can the clusters balance the load by themselves? Thank you!
Spark cluster: EC2 instances 2, 3, 5
Yarn cluster: EC2 instances 1, 2, 3, 4, 5
Memcached database cluster: EC2 instances 2, 3, 4, 6
instance 1: 512GB RAM, 2 vCPU, 2 cores
instance 2: 1TB RAM, 8 vCPU, 4 cores
instance 3: 2TB RAM, 6 vCPU, 6 cores
instance 4: 256GB RAM, 2 vCPU, 2 cores
instance 5: 2TB RAM, 16 vCPU, 4 cores
instance 6: 4TB RAM, 4 vCPU, 8 cores
Upvotes: 2
Views: 73
Reputation: 841
Clusters are not aware of this sharing; you need to configure resource allocations per host to avoid over-commitments.
If, on any node, the total resources allocated across services exceed the RAM, cores, or disk actually available, you are at risk (most often, the risk is a Spark task or YARN child being unable to start). For example, instance 3 has 2TB of RAM and hosts all three services, so you can't allocate 1TB to each service.
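To make the over-commitment check concrete, here is a minimal sketch in Python. The RAM sizes and cluster membership come from the question; the function name `overcommitted_hosts` and the per-cluster allocation figures are hypothetical, and the sketch assumes each cluster is given the same amount of RAM on every host it runs on.

```python
# RAM per instance in GB, taken from the question.
HOST_RAM_GB = {1: 512, 2: 1024, 3: 2048, 4: 256, 5: 2048, 6: 4096}

# Which instances each cluster runs on, taken from the question.
CLUSTER_HOSTS = {
    "spark": [2, 3, 5],
    "yarn": [1, 2, 3, 4, 5],
    "memcached": [2, 3, 4, 6],
}


def overcommitted_hosts(alloc_gb):
    """Return {host: (allocated_gb, capacity_gb)} for over-committed hosts.

    alloc_gb maps a cluster name to the GB it is configured to use on
    each of its hosts (a hypothetical, uniform per-cluster figure).
    """
    totals = {host: 0 for host in HOST_RAM_GB}
    for cluster, hosts in CLUSTER_HOSTS.items():
        for host in hosts:
            totals[host] += alloc_gb[cluster]
    # A host is over-committed when allocations exceed its physical RAM.
    return {
        host: (total, HOST_RAM_GB[host])
        for host, total in totals.items()
        if total > HOST_RAM_GB[host]
    }


# Hypothetical plan: 1TB per service on every host it uses.
risky = overcommitted_hosts({"spark": 1024, "yarn": 1024, "memcached": 1024})
print(risky)
```

With 1TB per service, instance 3 ends up with 3TB allocated against 2TB of physical RAM, so it shows up in the result. The real limits would be set in each service's own configuration, for example `yarn.nodemanager.resource.memory-mb` for YARN NodeManagers and the `-m` option for memcached.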
As a side note, Spark can run on YARN, so there is an option to reduce this to two clusters.
Upvotes: 1