Freya Ren

Reputation: 2164

Can I make my Hadoop reducer quicker?

I'm new to Hadoop and am just trying the wordcount example. I built a single-node cluster following http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

I uploaded a very simple text file with a few words to HDFS and ran wordcount.jar.

Somehow the reducer takes a very, very long time to run. I know I/O is the bottleneck, but are there any parameters I can set to make it faster? (The reduce phase is still at 0% after almost 20 minutes.)

13/06/04 15:53:14 INFO mapred.JobClient:  map 100% reduce 0%

Upvotes: 0

Views: 271

Answers (2)

Vbp

Reputation: 1982

If you want to modify Hadoop settings, such as increasing the number of reduce tasks, you can use the "-D" option:

hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar hadoop*examples*.jar wordcount -D mapred.reduce.tasks=8 /user/hduser/temp-data /user/hduser/temp-data-output

Moreover, with HDFS you cannot force the number of map tasks via mapred.map.tasks, but you can specify mapred.reduce.tasks, as explained in this link.
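
For completeness: if you write your own driver class instead of running the bundled examples jar, you can fix the number of reduce tasks in code with job.setNumReduceTasks(). The sketch below follows the standard WordCount structure on the Hadoop 1.x org.apache.hadoop.mapreduce API (the class name WordCountDriver and the value 8 are just illustrative); job.setNumReduceTasks(8) has the same effect as passing -D mapred.reduce.tasks=8 on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {

  // Emits (word, 1) for every token in the input line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Sums the counts for each word; also reused as a combiner.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");      // Job.getInstance(conf, ...) on newer Hadoop
    job.setJarByClass(WordCountDriver.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // combiner cuts shuffle traffic to the reducer
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Programmatic equivalent of -D mapred.reduce.tasks=8.
    job.setNumReduceTasks(8);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Keep in mind that on a tiny single-node input this mainly demonstrates the knob; if reduce is genuinely stuck at 0%, the checks in the other answer are the first thing to look at.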

Upvotes: 1

Kun Ling

Reputation: 2219

It seems your Hadoop setup has some issues, and the MapReduce job could not run correctly.

Please check:

  1. Check whether Hadoop is working correctly by accessing http://localhost:50030, which is the JobTracker web UI.
  2. Look into the log files under $HADOOP_HOME/logs/, especially *jobtracker*.log and *tasktracker*.log.

If it is your first time testing Hadoop, please also check this link: Hadoop WordCount example stuck at map 100% reduce 0%

Upvotes: 0
