Reputation:
I have 1 TB of Hive data that I want to process within 2 hours. The Hadoop cluster will not grow, because it has no user interaction. How much RAM and CPU does each machine need if I want to run 3 machines?
Upvotes: 1
Views: 151
Reputation: 10428
This depends on the complexity of your process. A simple word count will surely complete before a complex data science algorithm. Your choice of implementation (e.g. MapReduce vs. Spark) will also influence execution time.
For any given hardware specification, some processes may complete while others may miss the deadline. You won't get a complete answer without giving more details about your workload (and even then the answer will probably be a recommendation to run practical experiments with your particular process). However, I can say that when sizing a cluster, there are two resources I tend to reference:
http://blog.cloudera.com/blog/2013/08/how-to-select-the-right-hardware-for-your-new-hadoop-cluster/
The Cloudera blog in particular discusses how hardware requirements differ depending on whether your workload is storage-intensive, compute-intensive, etc.
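As a starting point for those experiments, here is a rough back-of-the-envelope sketch of the scan throughput your numbers imply. It assumes 1 TB means 10^12 bytes, that the data is spread evenly over the 3 machines, and that the job reads the data once; the `passes` parameter is a hypothetical knob for jobs that read or shuffle the data several times. It only gives a lower bound on I/O and says nothing about RAM or CPU, which still depend on the workload.

```python
def required_throughput_mb_s(data_bytes, deadline_s, nodes, passes=1.0):
    """Return (aggregate, per_node) throughput in MB/s needed to meet the deadline.

    passes is an assumed multiplier for how many times the job effectively
    reads the data (1.0 = a single full scan).
    """
    if deadline_s <= 0 or nodes <= 0:
        raise ValueError("deadline_s and nodes must be positive")
    aggregate = data_bytes * passes / deadline_s / 1e6  # bytes/s -> MB/s
    return aggregate, aggregate / nodes


TB = 10**12  # assumption: decimal terabyte

aggregate, per_node = required_throughput_mb_s(1 * TB, 2 * 3600, nodes=3)
print(f"aggregate: {aggregate:.1f} MB/s, per node: {per_node:.1f} MB/s")
# prints "aggregate: 138.9 MB/s, per node: 46.3 MB/s"
```

If a single pass really is all you need, roughly 46 MB/s per node is modest, so the deadline is more likely to be decided by how much computation and shuffling your queries do than by raw disk reads.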
Upvotes: 2