Reputation: 811
I have configured gridgain-hadoop-os-6.6.2.zip and followed the steps in docs/hadoop_readme.pdf. I started GridGain with bin/ggstart.sh, and I am now trying to run a simple wordcount job in GridGain with hadoop-2.2.0, using the command:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/*-mapreduce-examples-*.jar wordcount /input /output
Steps I tried:
Step 1: Extracted hadoop-2.2.0 and gridgain-hadoop-os-6.6.2.zip into the /usr/local folder and renamed the GridGain folder to "gridgain".
Step 2: Set GRIDGAIN_HOME with export GRIDGAIN_HOME=/usr/local/gridgain, and set the hadoop-2.2.0 paths (along with JAVA_HOME) as:
# Set Hadoop-related environment variables
export HADOOP_PREFIX=/usr/local/hadoop-2.2.0
export HADOOP_HOME=/usr/local/hadoop-2.2.0
export HADOOP_MAPRED_HOME=/usr/local/hadoop-2.2.0
export HADOOP_COMMON_HOME=/usr/local/hadoop-2.2.0
export HADOOP_HDFS_HOME=/usr/local/hadoop-2.2.0
export YARN_HOME=/usr/local/hadoop-2.2.0
export HADOOP_CONF_DIR=/usr/local/hadoop-2.2.0/etc/hadoop
export GRIDGAIN_HADOOP_CLASSPATH='/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*'
Step 3:
Then I ran bin/setup-hadoop.sh and answered Y to every prompt.
Step 4:
Started GridGain with the command:
bin/ggstart.sh
Step 5:
Then I created a directory and uploaded a file using:
hadoop fs -mkdir /input
hadoop fs -copyFromLocal $HADOOP_HOME/README.txt /input/WORD_COUNT_ME.txt
Step 6:
Running this command gives me an error:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/*-mapreduce-examples-*.jar wordcount /input /output
I get the following error:
15/02/22 12:49:13 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
15/02/22 12:49:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_091ebfbd-2993-475f-a506-28280dbbf891_0002
15/02/22 12:49:13 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hduser/.staging/job_091ebfbd-2993-475f-a506-28280dbbf891_0002
java.lang.NullPointerException
at org.gridgain.client.hadoop.GridHadoopClientProtocol.processStatus(GridHadoopClientProtocol.java:329)
at org.gridgain.client.hadoop.GridHadoopClientProtocol.submitJob(GridHadoopClientProtocol.java:115)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:430)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
and the GridGain console shows this error:
sLdrId=a0b8610bb41-091ebfbd-2993-475f-a506-28280dbbf891, userVer=0, loc=true, sampleClsName=java.lang.String, pendingUndeploy=false, undeployed=false, usage=0]], taskClsName=o.g.g.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask, sesId=e129610bb41-091ebfbd-2993-475f-a506-28280dbbf891, startTime=1424589553332, endTime=9223372036854775807, taskNodeId=091ebfbd-2993-475f-a506-28280dbbf891, clsLdr=sun.misc.Launcher$AppClassLoader@1bdcbb2, closed=false, cpSpi=null, failSpi=null, loadSpi=null, usage=1, fullSup=false, subjId=091ebfbd-2993-475f-a506-28280dbbf891], jobId=f129610bb41-091ebfbd-2993-475f-a506-28280dbbf891]]
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/JobContext
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2585)
at java.lang.Class.getConstructor0(Class.java:2885)
at java.lang.Class.getConstructor(Class.java:1723)
at org.gridgain.grid.hadoop.GridHadoopDefaultJobInfo.createJob(GridHadoopDefaultJobInfo.java:107)
at org.gridgain.grid.kernal.processors.hadoop.jobtracker.GridHadoopJobTracker.job(GridHadoopJobTracker.java:959)
at org.gridgain.grid.kernal.processors.hadoop.jobtracker.GridHadoopJobTracker.submit(GridHadoopJobTracker.java:222)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopProcessor.submit(GridHadoopProcessor.java:188)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopImpl.submit(GridHadoopImpl.java:73)
at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask.run(GridHadoopProtocolSubmitJobTask.java:54)
at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask.run(GridHadoopProtocolSubmitJobTask.java:37)
at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolTaskAdapter$Job.execute(GridHadoopProtocolTaskAdapter.java:95)
at org.gridgain.grid.kernal.processors.job.GridJobWorker$2.call(GridJobWorker.java:484)
at org.gridgain.grid.util.GridUtils.wrapThreadLoader(GridUtils.java:6136)
at org.gridgain.grid.kernal.processors.job.GridJobWorker.execute0(GridJobWorker.java:478)
at org.gridgain.grid.kernal.processors.job.GridJobWorker.body(GridJobWorker.java:429)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Failed to load class: org.apache.hadoop.mapreduce.JobContext
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClass(GridHadoopClassLoader.java:125)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 20 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.JobContext
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClassExplicitly(GridHadoopClassLoader.java:196)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClass(GridHadoopClassLoader.java:106)
... 21 more
Any help would be appreciated.
Edit:
raj@ubuntu:~$ hadoop classpath
/usr/local/hadoop-2.2.0/etc/hadoop:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/common/*:/usr/local/hadoop-2.2.0/share/hadoop/hdfs:/usr/local/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/hdfs/*:/usr/local/hadoop-2.2.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/yarn/*:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/*:/usr/local/hadoop-2.2.0/contrib/capacity-scheduler/*.jar
raj@ubuntu:~$ jps
3529 GridCommandLineStartup
3646 Jps
raj@ubuntu:~$ echo $GRIDGAIN_HOME
/usr/local/gridgain
raj@ubuntu:~$ echo $HADOOP_HOME
/usr/local/hadoop-2.2.0
raj@ubuntu:~$ hadoop version
Hadoop 2.2.0
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /usr/local/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar
raj@ubuntu:~$ cd /usr/local/hadoop-2.2.0/share/hadoop/mapreduce
raj@ubuntu:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce$ ls
hadoop-mapreduce-client-app-2.2.0.jar hadoop-mapreduce-client-hs-2.2.0.jar hadoop-mapreduce-client-jobclient-2.2.0-tests.jar lib
hadoop-mapreduce-client-common-2.2.0.jar hadoop-mapreduce-client-hs-plugins-2.2.0.jar hadoop-mapreduce-client-shuffle-2.2.0.jar lib-examples
hadoop-mapreduce-client-core-2.2.0.jar hadoop-mapreduce-client-jobclient-2.2.0.jar hadoop-mapreduce-examples-2.2.0.jar sources
raj@ubuntu:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce$
Upvotes: 0
Views: 4569
Reputation: 181
I configured exactly the versions you mentioned (gridgain-hadoop-os-6.6.2.zip + hadoop-2.2.0) -- the "wordcount" sample works fine.
[Update after analyzing the logs from the question's author:]
Raju, thanks for the detailed logs. The cause of the problem is incorrectly set env variables:
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
You explicitly set all these variables to the ${HADOOP_HOME} value, which is wrong. This causes GG to compose an incorrect Hadoop classpath, as seen in the GG node log below:
+++ HADOOP_PREFIX=/usr/local/hadoop-2.2.0
+++ [[ -z /usr/local/hadoop-2.2.0 ]]
+++ '[' -z /usr/local/hadoop-2.2.0 ']'
+++ HADOOP_COMMON_HOME=/usr/local/hadoop-2.2.0
+++ HADOOP_HDFS_HOME=/usr/local/hadoop-2.2.0
+++ HADOOP_MAPRED_HOME=/usr/local/hadoop-2.2.0
+++ GRIDGAIN_HADOOP_CLASSPATH='/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*'
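Every entry in that classpath points at /usr/local/hadoop-2.2.0/lib/*, while the `hadoop classpath` output in the question lists the jars under share/hadoop/... instead. One way to confirm that a classpath like this matches no jars is a small helper such as the one below. It is only a sketch: `check_classpath` is a hypothetical name, not part of GridGain or Hadoop.

```shell
# Hypothetical helper: report, for each entry of a ':'-separated classpath,
# whether it actually matches any jar files (or, for plain entries, exists).
check_classpath() {
    cp=$1
    old_ifs=$IFS
    set -f                 # keep the '*' in entries from being glob-expanded
    IFS=:
    set -- $cp             # split the classpath into positional parameters
    IFS=$old_ifs
    set +f
    for entry in "$@"; do
        case $entry in
            *'/*')
                dir=${entry%/\*}           # strip the trailing "/*"
                if ls "$dir"/*.jar >/dev/null 2>&1; then
                    echo "ok: $entry"
                else
                    echo "no jars: $entry"
                fi
                ;;
            *)
                if [ -e "$entry" ]; then
                    echo "ok: $entry"
                else
                    echo "missing: $entry"
                fi
                ;;
        esac
    done
}
```

Calling `check_classpath "$GRIDGAIN_HADOOP_CLASSPATH"` on the node's machine prints one line per entry; any "no jars" or "missing" line marks a directory that the class loader cannot load Hadoop classes from.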
So, to fix the issue, please don't set unnecessary env variables. JAVA_HOME and HADOOP_HOME are quite enough; nothing else is needed.
Upvotes: 2
Reputation: 811
Many thanks to Ivan for your help and support; the solution you gave got me out of the problem.
The fix was to not set the other Hadoop-related environment variables. It is enough to set only:
JAVA_HOME, HADOOP_HOME and GRIDGAIN_HOME
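For anyone hitting the same error, the resulting environment might look roughly like the sketch below. The two paths come from the question. Unsetting GRIDGAIN_HADOOP_CLASSPATH as well is an assumption on my part, and JAVA_HOME is left out because its location depends on your JDK install.

```shell
# Sketch only: drop the per-component overrides named in the answer so the
# GridGain scripts can derive them from HADOOP_HOME on their own.
unset HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME

# Assumption: the hand-written classpath from the question is stale once the
# overrides are gone, so it is cleared too.
unset GRIDGAIN_HADOOP_CLASSPATH

# Paths taken from the question. JAVA_HOME must additionally point at your JDK;
# its location is not given in the thread, so it is not set here.
export HADOOP_HOME=/usr/local/hadoop-2.2.0
export GRIDGAIN_HOME=/usr/local/gridgain
```

If the old exports live in a shell startup file such as ~/.bashrc, remove them there as well, then restart the GridGain node from a fresh shell so it picks up the new environment.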
Upvotes: 0