Yasir K

Reputation: 31

Why does Spark job fail on Mesos with "hadoop: not found"?

I use Spark 1.6.1, Hadoop 2.6.4 and Mesos 0.28 on Debian 8.

While trying to submit a job via spark-submit to a Mesos cluster, a slave fails with the following in its stderr log:

I0427 22:35:39.626055 48258 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/ad642fcf-9951-42ad-8f86-cc4f5a5cb408-S0\/hduser","items":[{"action":"BYP$
I0427 22:35:39.628031 48258 fetcher.cpp:379] Fetching URI 'hdfs://xxxxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
I0427 22:35:39.628057 48258 fetcher.cpp:250] Fetching directly into the sandbox directory
I0427 22:35:39.628078 48258 fetcher.cpp:187] Fetching URI 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
E0427 22:35:39.629243 48258 shell.hpp:93] Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
Failed to fetch 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar': Failed to create HDFS client: Failed to execute 'hadoop version 2>&1'; the command was e$
Failed to synchronize with slave (it's probably exited)
  1. My JAR file contains the Hadoop 2.6 binaries
  2. The path to the Spark executor/binary is an hdfs:// link

My jobs don't appear in the Frameworks tab, but they do show up in the driver with the status 'queued', and they just sit there until I shut down the spark-mesos-dispatcher.sh service.
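For context, the submission looks roughly like this (the dispatcher host, namenode host, and class name below are placeholders, not my real values):

spark-submit \
  --class SimpleEventCounter \
  --master mesos://dispatcher-host:7077 \
  --deploy-mode cluster \
  hdfs://namenode-host:54310/sources/spark/SimpleEventCounter.jar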

Upvotes: 1

Views: 1385

Answers (1)

Hardy

Reputation: 477

I was seeing a very similar error, and I figured out that my problem was that hadoop_home wasn't set on the Mesos agent. On each mesos-slave I added the following line to /etc/default/mesos-slave (the path may be different on your install): MESOS_hadoop_home="/path/to/my/hadoop/install/folder/"

EDIT: Hadoop has to be installed on each slave; /path/to/my/hadoop/install/folder is a local path on that slave.
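As a rough sketch of what that looks like on one slave (the /usr/local/hadoop path and the service name are just examples from my setup; substitute wherever Hadoop is unpacked on yours):

# /etc/default/mesos-slave on each Mesos slave
MESOS_hadoop_home="/usr/local/hadoop"

# restart the agent so it picks up the new setting
sudo service mesos-slave restart

# sanity check: the fetcher shells out to 'hadoop version', so this must succeed locally
/usr/local/hadoop/bin/hadoop version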

Upvotes: 0
