user3755282

Reputation: 883

Spark on Mesos - Failing to fetch binary

Trying to run a Spark job on a Mesos cluster. I get an error while the slave tries to fetch the Spark binary.

Tried keeping the binary on:

  1. HDFS
  2. the local file system on the slaves.

Used the following paths for SPARK_EXECUTOR_URI:

Filesystem path - file://home/labadmin/spark-1.2.1.tgz

I0501 10:27:19.302435 30510 fetcher.cpp:214] Fetching URI 'file://home/labadmin/spark-1.2.1.tgz'
Failed to fetch: file://home/labadmin/spark-1.2.1.tgz
Failed to synchronize with slave (it's probably exited)

HDFS path with no port - hdfs://ipaddress/spark/spark-1.2.1.tgz

0427 09:23:21.616092  4842 fetcher.cpp:214] Fetching URI 'hdfs://ipaddress/spark/spark-1.2.1.tgz'
E0427 09:23:24.710765  4842 fetcher.cpp:113] HDFS copyToLocal failed: /usr/lib/hadoop/bin/hadoop fs -copyToLocal 'hdfs://ipaddress/spark/spark-1.2.1.tgz' '/tmp/mesos/slaves/20150427-054938-2933394698-5050-1030-S0/frameworks/20150427-054938-2933394698-5050-1030-0002/executors/20150427-054938-2933394698-5050-1030-S0/runs/5c13004a-3d8c-40a4-bac4-9c07249e1923/spark-1.2.1.tgz'
copyToLocal: Call From sclq174.lss.emc.com/ipaddress to sclq174.lss.emc.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

HDFS path with port 50070 - hdfs://ipaddress:50070/spark/spark-1.2.1.tgz

I0427 13:34:25.295554 16633 fetcher.cpp:214] Fetching URI 'hdfs://ipaddress:50070/spark/spark-1.2.1.tgz'
E0427 13:34:28.438596 16633 fetcher.cpp:113] HDFS copyToLocal failed: /usr/lib/hadoop/bin/hadoop fs -copyToLocal 'hdfs://ipaddress:50070/spark/spark-1.2.1.tgz' '/tmp/mesos/slaves/20150427-054938-2933394698-5050-1030-S0/frameworks/20150427-054938-2933394698-5050-1030-0008/executors/20150427-054938-2933394698-5050-1030-S0/runs/2fc7886a-cfff-4cb2-b2f6-25988ca0f8e3/spark-1.2.1.tgz'
copyToLocal: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: 

Any ideas why it's not working?

Upvotes: 0

Views: 680

Answers (1)

Tombart

Reputation: 32378

Spark supports different ways of fetching binaries:

  • file: - Absolute paths and file:/ URIs are served by the driver’s HTTP file server, and every executor pulls the file from the driver HTTP server.
  • hdfs:, http:, https:, ftp: - these pull down files and JARs from the URI as expected
  • local: - a URI starting with local:/ is expected to exist as a local file on each worker node
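The `file:` case can be seen by parsing the question's URI with Python's standard library (a quick sketch, not Spark code):

```python
# Sketch: why file://home/... fails. After the two slashes comes the
# authority (host) component of the URI, so "home" is parsed as a host
# name and the path loses its leading directory. With three slashes the
# authority is empty and the full absolute path survives.
from urllib.parse import urlparse

bad = urlparse('file://home/labadmin/spark-1.2.1.tgz')
good = urlparse('file:///home/labadmin/spark-1.2.1.tgz')

print(bad.netloc, bad.path)    # prints: home /labadmin/spark-1.2.1.tgz
print(good.netloc, good.path)  # prints an empty host and /home/labadmin/spark-1.2.1.tgz
```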

    1. file://home/labadmin/spark-1.2.1.tgz was not accessible from the driver: with only two slashes, home is parsed as a host name rather than part of the path. You probably wanted file:///home/labadmin/spark-1.2.1.tgz, or a local:/ URI if the archive exists on every worker node.
    2. There's probably no HDFS server listening on sclq174.lss.emc.com:8020, hence the connection refused.
    3. Port 50070 is the NameNode's web UI (HTTP) port, so the HDFS RPC client hits a protocol mismatch (the protobuf "end-group tag" error). Use the RPC address configured as fs.defaultFS (commonly port 8020), and replace the hostname with an actual IP address if DNS resolution is the problem, e.g. 192.168.1.1:8020.
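Putting the fixes together, conf/spark-env.sh could look roughly like this. This is only a sketch: the paths and the 192.168.1.1:8020 address are illustrative, and you should take the HDFS host:port from your cluster's fs.defaultFS setting.

```shell
# conf/spark-env.sh -- sketch, values are illustrative.
# Option A: the archive exists at the same local path on every slave,
# so no network fetch is needed:
export SPARK_EXECUTOR_URI=/home/labadmin/spark-1.2.1.tgz
# Option B: fetch from HDFS; the URI must target the NameNode RPC port
# (the one in fs.defaultFS, commonly 8020), not the 50070 web UI port:
# export SPARK_EXECUTOR_URI=hdfs://192.168.1.1:8020/spark/spark-1.2.1.tgz
```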

Upvotes: 1
