Reputation: 111
I ran Giraph 1.1.0 on Hadoop 2.6.0. The mapred-site.xml looks like this
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs. Can be one of
local, classic or yarn.</description>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>8192</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx3072m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx6144m</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>4</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>4</value>
</property>
</configuration>
The giraph-site.xml looks like this
<configuration>
<property>
<name>giraph.SplitMasterWorker</name>
<value>true</value>
</property>
<property>
<name>giraph.logLevel</name>
<value>error</value>
</property>
</configuration>
I do not want to run the job in local mode. I have also set the environment variable MAPRED_HOME to $HADOOP_HOME. This is the command I use to run the program:
hadoop jar myjar.jar hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 1
When I run this code, which computes the betweenness centrality of the vertices in a graph, I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:168)
at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:236)
at hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation.runMain(BetweennessComputation.java:214)
at hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation.main(BetweennessComputation.java:218)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
What should I do to ensure that the job does not run in local mode?
Upvotes: 0
Views: 624
Reputation: 21
I ran into the same problem just a few days ago and, fortunately, solved it as follows.
Modify the configuration file mapred-site.xml: make sure the value of the property 'mapreduce.framework.name' is 'yarn', and add the property 'mapreduce.jobtracker.address' with the value 'yarn' if it is not already there.
The mapred-site.xml looks like this:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.address</name>
<value>yarn</value>
</property>
</configuration>
Restart Hadoop after modifying mapred-site.xml. Then run your program again, setting the value after '-w' to more than 1 and 'giraph.SplitMasterWorker' to 'true'. It will probably work.
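As a quick sanity check (a sketch only; the sample file below stands in for your real $HADOOP_HOME/etc/hadoop/mapred-site.xml, so adjust the path to your install), you can grep the file to confirm which value the framework property actually carries:

```shell
# Write a sample mapred-site.xml; in practice, point the grep below at
# the real file under $HADOOP_HOME/etc/hadoop/ instead.
cat > /tmp/mapred-site.xml <<'EOF'
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
EOF

# Print the value that follows the framework property. Anything other
# than "yarn" (or a missing property) can make the job fall back to
# LocalJobRunner.
grep -A1 'mapreduce.framework.name' /tmp/mapred-site.xml | grep -o '<value>[^<]*</value>'
```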
As for the cause of the problem, I will just quote what somebody else said: these properties are designed for single-node execution and have to be changed when running on a cluster of nodes. In that situation, the jobtracker has to point to one of the machines that runs a NodeManager daemon (a Hadoop slave). As for the framework, it should be changed to 'yarn'.
Upvotes: 2
Reputation: 1776
We can see in the stack trace that the configuration check in LocalJobRunner fails. This is a bit misleading, because it makes us assume that we are running in local mode. You already found the responsible configuration option: giraph.SplitMasterWorker, and in your case you set it to true. However, with the last command-line parameter, 1, you specify to use only a single worker. Hence the framework decides that you MUST be running in local mode. As a solution you have two options:
1. Set giraph.SplitMasterWorker to false, although you are running on a cluster.
2. Increase the number of workers by changing the last parameter of the command-line call:
hadoop jar myjar.jar hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation /user/$USER/inputbc/inputgraph.txt /user/$USER/outputBC 1.0 4
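For the first option, the override can go into the giraph-site.xml from the question, with only the value flipped:

```xml
<configuration>
<property>
<name>giraph.SplitMasterWorker</name>
<value>false</value>
</property>
</configuration>
```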
Please also refer to my other answer on SO (Apache Giraph master / worker mode) for details on the problem concerning local mode.
Upvotes: 0
Reputation: 1
If you want to split the master from the workers, you can use:
-ca giraph.SplitMasterWorker=true
and to specify the number of workers, you can use:
-w #
where "#" is the number of workers you want to use.
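Putting the two flags together, a full invocation might look like the sketch below. Note that launching through org.apache.giraph.GiraphRunner, and the -vip/-op path flags, are assumptions here; the question's jar ships its own main class, which may parse options differently.

```shell
# Hedged sketch of a GiraphRunner-style launch (cluster required; the
# runner class and -vip/-op flags are assumptions, not taken from the
# question's own main class).
hadoop jar myjar.jar org.apache.giraph.GiraphRunner \
  hu.elte.inf.mbalassi.msc.giraph.betweenness.BetweennessComputation \
  -ca giraph.SplitMasterWorker=true \
  -w 4 \
  -vip /user/$USER/inputbc/inputgraph.txt \
  -op /user/$USER/outputBC
```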
Upvotes: -1