Reputation: 93
I have 100 mappers and 1 reducer running in a job. How can I improve the job's performance?
As I understand it, using a combiner can improve performance to a great extent. But what else do we need to configure to improve the job's performance?
Upvotes: 0
Views: 2767
Reputation: 38950
The question gives too little information (input file size, HDFS block size, average map processing time, number of mapper and reducer slots in the cluster, etc.) for us to suggest specific tips.
But there are some general guidelines for improving performance.
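To illustrate the combiner mentioned in the question, here is a minimal plain-Java sketch of why local pre-aggregation helps when 100 mappers feed a single reducer. It is a simulation, not Hadoop API code: the class `CombinerSketch`, its methods, and the sample splits are all hypothetical names made up for the example. In a real job the combiner class is registered on the driver with `job.setCombinerClass(...)`, and it is only safe when the reduce operation is commutative and associative (a sum or a max, for instance), since Hadoop may run the combiner zero, one, or several times.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of a combiner for a word count; not Hadoop API code.
class CombinerSketch {

    // Hypothetical mapper: emits one (word, 1) pair per token in the split.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Combiner: sums the counts per key on the mapper side, before the shuffle.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> partial = new HashMap<>();
        for (Map.Entry<String, Integer> e : mapOutput) {
            partial.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return partial;
    }

    // Number of records the single reducer would receive across all splits.
    static int shuffledRecords(List<String> splits, boolean useCombiner) {
        int records = 0;
        for (String split : splits) {
            List<Map.Entry<String, Integer>> mapOutput = map(split);
            records += useCombiner ? combine(mapOutput).size() : mapOutput.size();
        }
        return records;
    }

    public static void main(String[] args) {
        // Three made-up input splits, one per simulated mapper.
        List<String> splits = List.of("a b a a", "b b c", "a c c c");
        System.out.println("without combiner: " + shuffledRecords(splits, false));
        System.out.println("with combiner:    " + shuffledRecords(splits, true));
    }
}
```

With these three sample splits the reducer receives 11 records without the combiner and 6 with it. The saving grows with the number of repeated keys per mapper, which is what matters when everything funnels into one reducer.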
Some more tips:
Use the most appropriate data type for the output. (Do not use LongWritable when the range of output values fits in the Integer range. IntWritable is the right choice in this case.)
Reuse Writables.
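The point about IntWritable versus LongWritable can be sketched in plain Java. As far as I know, `IntWritable` serializes its value with `writeInt` and `LongWritable` with `writeLong` on a `DataOutput`, so the sketch below writes the same values both ways and compares the byte counts. The class `WritableSizeSketch` and its method are hypothetical, and the sketch deliberately avoids the Hadoop classes so it runs on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Plain-Java sketch of the serialized size of int vs. long values.
// Assumption: mirrors IntWritable (writeInt) and LongWritable (writeLong).
class WritableSizeSketch {

    // Serializes the values 0..count-1 and returns the number of bytes written.
    static int serializedSize(int count, boolean asLong) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (int i = 0; i < count; i++) {
                if (asLong) {
                    out.writeLong(i);   // 8 bytes per value
                } else {
                    out.writeInt(i);    // 4 bytes per value
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    public static void main(String[] args) {
        int count = 1000;
        System.out.println("as int:  " + serializedSize(count, false) + " bytes");
        System.out.println("as long: " + serializedSize(count, true) + " bytes");
    }
}
```

For 1000 values this gives 4000 bytes as ints and 8000 bytes as longs, before any compression. Halving the value size halves that part of the intermediate data that has to be spilled, shuffled, and merged.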
Have a look at this Cloudera article for some more tips.
Upvotes: 4