Reputation: 2361
I have a cluster with 50 nodes, and each node has 8 cores for computation. If I have a job that I plan to run with 200 reducers, what would be a good resource allocation strategy for better performance?
I mean, is it better to allocate 50 nodes with 4 cores on each, or 25 nodes with 8 cores on each? Which one is better in which case?
Upvotes: 1
Views: 322
Reputation: 39893
To answer your question, it depends on a few things. The 50 nodes are going to be better in general, in my opinion:
However, if your main concern is the network, here are a few downsides of having 50 nodes:
Even with these network concerns, I think you'll find that the 50 nodes are better, because the value of a node is not just its number of cores. You mostly have to consider how many disks you have.
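As a rough sketch of the arithmetic behind this comparison: both layouts expose 200 cores, so 200 reducers fit in a single wave either way, and what changes is how many disks share the work. The one-reducer-per-core mapping and the figure of 4 disks per node are assumptions for illustration, not values from the question, and `summarize` is a hypothetical helper.

```python
import math

def summarize(nodes, cores_per_node, reducers, disks_per_node):
    """Back-of-the-envelope capacity for one allocation layout.

    Assumes one reducer per core; disks_per_node is a hypothetical value.
    """
    slots = nodes * cores_per_node          # reducers that can run at once
    waves = math.ceil(reducers / slots)     # rounds needed to run all reducers
    total_disks = nodes * disks_per_node    # disks available across the layout
    return {
        "slots": slots,
        "waves": waves,
        "total_disks": total_disks,
        "reducers_per_disk": reducers / total_disks,
    }

# 50 nodes x 4 cores versus 25 nodes x 8 cores, 200 reducers,
# with an assumed 4 disks per node in both cases.
wide = summarize(nodes=50, cores_per_node=4, reducers=200, disks_per_node=4)
narrow = summarize(nodes=25, cores_per_node=8, reducers=200, disks_per_node=4)

print("50 x 4:", wide)
print("25 x 8:", narrow)
```

Under these assumptions the core count is identical, but the 25-node layout has half as many disks, so each disk serves twice as many reducers.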
Upvotes: 1
Reputation: 20969
It is hard to say; usually the rule is "the higher the better". More machines would be better for guarding against failure.
Usually Hadoop is fine with commodity hardware, so you can pick the 50 servers with 4 cores each.
But I would pick the 8-core servers if they had superior hardware, e.g. higher CPU frequency, DDR3 RAM, or 10k rpm disks.
Upvotes: 1