Reputation: 442
I have a Mesos cluster set up, and I have verified that the master can see the slaves, but when I attempt to run a Hadoop job, all tasks end up with a status of LOST. The same error is present in all the slave stderr logs:
Error: Could not find or load main class org.apache.hadoop.mapred.MesosExecutor
and that is the only line in the stderr logs.
Following the instructions on http://mesosphere.io/learn/run-hadoop-on-mesos/, I have put a modified Hadoop distribution on HDFS which each slave can access.
In the lib directory of the Hadoop distribution, I have added hadoop-mesos-0.0.4.jar and mesos-0.14.2.jar.
I have verified that each slave does in fact download this Hadoop distribution, and that hadoop-mesos-0.0.4.jar contains the class org.apache.hadoop.mapred.MesosExecutor, so I cannot figure out why the class cannot be found.
I am using Hadoop from CDH4.4.0 and mesos-0.15.0-rc4.
Does anyone have any suggestions as to what might be the problem? I know I would normally start by suspecting a CLASSPATH problem, but in this case the mesos-slave is downloading, unpacking, and attempting to run a Hadoop TaskTracker, so I would imagine any CLASSPATH would be set up by the mesos-slave.
In the stdout of the slave logs, the environment is printed. There is a MESOS_HADOOP_HOME which is empty. Should this be set to something? If it is supposed to be set to the downloaded Hadoop distribution, I cannot set it in advance because the Hadoop distribution is downloaded to a new location every time.
In case it is related (a permissions issue, perhaps): when I attempt to browse slave logs via the master UI, I get the error "Error browsing path: ...". The user running mesos-slave can browse to the correct directory when I do so manually.
Upvotes: 2
Views: 900
Reputation: 442
I found the problem. bin/hadoop of the downloaded Hadoop distribution attempts to find its location by running which $0. However, that will find a current Hadoop installation if one exists (i.e. /usr/lib/hadoop), and will load the jars under that installation's lib directory instead of the downloaded one's lib directory.
I had to modify bin/hadoop of the downloaded distribution to find its own location with dirname $0 instead of which $0.
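For anyone making the same change, here is a minimal sketch of the dirname-based lookup. The function name script_dir is my own and is not part of Hadoop, and the exact lines in your copy of bin/hadoop may look different, so treat this as an illustration of the idea rather than a drop-in patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): resolve the directory of a script
# from the path it was invoked with, instead of searching PATH with `which`.
# A PATH search can return a different installation's copy of the script,
# which is the failure described above.
script_dir() {
  # $1 is the invocation path, i.e. what $0 holds inside the script.
  # cd into its parent directory and print that directory as an absolute path.
  (cd "$(dirname "$1")" >/dev/null && pwd)
}
```

Inside bin/hadoop this would be used as something like bin=$(script_dir "$0"), so that the lib directory is resolved relative to the downloaded distribution rather than whatever hadoop happens to be on the PATH.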
Upvotes: 3