3xCh1_23

Reputation: 1499

Failed to bind to: spark-master, using a remote cluster with two workers

I managed to get everything working with a local master and two remote workers. Now I want to connect to a remote master that has the same two remote workers. I have tried different combinations of settings in /etc/hosts and other recommendations found on the Internet, but nothing has worked.

The Main class is:

import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.rdd.RDD;

import scala.Tuple2;

// ScalaInterface and CHUNK_SIZE are defined elsewhere in my project.
public static void main(String[] args) {
    ScalaInterface sInterface = new ScalaInterface(CHUNK_SIZE,
            "awsAccessKeyId",
            "awsSecretAccessKey");

    SparkConf conf = new SparkConf().setAppName("POC_JAVA_AND_SPARK")
            .setMaster("spark://spark-master:7077");

    org.apache.spark.SparkContext sc = new org.apache.spark.SparkContext(
            conf);

    sInterface.enableS3Connection(sc);
    org.apache.spark.rdd.RDD<Tuple2<Path, Text>> fileAndLine = (RDD<Tuple2<Path, Text>>) sInterface.getMappedRDD(sc, "s3n://somebucket/");

    org.apache.spark.rdd.RDD<String> pInfo = (RDD<String>) sInterface.mapPartitionsWithIndex(fileAndLine);

    JavaRDD<String> pInfoJ = pInfo.toJavaRDD();

    List<String> result = pInfoJ.collect();

    String miscInfo = sInterface.getMiscInfo(sc, pInfo);

    System.out.println(miscInfo);

}

It fails at:

List<String> result = pInfoJ.collect();

The error I am getting is:

1354 [sparkDriver-akka.actor.default-dispatcher-3] ERROR akka.remote.transport.netty.NettyTransport  - failed to bind to spark-master/192.168.0.191:0, shutting down Netty transport
1354 [main] WARN  org.apache.spark.util.Utils  - Service 'sparkDriver' could not bind on port 0. Attempting port 1.
1355 [main] DEBUG org.apache.spark.util.AkkaUtils  - In createActorSystem, requireCookie is: off
1363 [sparkDriver-akka.actor.default-dispatcher-3] INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator  - Shutting down remote daemon.
1364 [sparkDriver-akka.actor.default-dispatcher-3] INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator  - Remote daemon shut down; proceeding with flushing remote transports.
1364 [sparkDriver-akka.actor.default-dispatcher-5] INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator  - Remoting shut down.
1367 [sparkDriver-akka.actor.default-dispatcher-4] INFO  akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
1370 [sparkDriver-akka.actor.default-dispatcher-6] INFO  Remoting  - Starting remoting
1380 [sparkDriver-akka.actor.default-dispatcher-4] ERROR akka.remote.transport.netty.NettyTransport  - failed to bind to spark-master/192.168.0.191:0, shutting down Netty transport
Exception in thread "main" 1382 [sparkDriver-akka.actor.default-dispatcher-6] INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator  - Shutting down remote daemon.
1382 [sparkDriver-akka.actor.default-dispatcher-6] INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator  - Remote daemon shut down; proceeding with flushing remote transports.
java.net.BindException: Failed to bind to: spark-master/192.168.0.191:0: Service 'sparkDriver' failed after 16 retries!
    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
    at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
    at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
    at scala.util.Try$.apply(Try.scala:161)
    at scala.util.Success.map(Try.scala:206)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
    at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
1383 [sparkDriver-akka.actor.default-dispatcher-7] INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator  - Remoting shut down.
1385 [delete Spark temp dirs] DEBUG org.apache.spark.util.Utils  - Shutdown hook called

Thank you kindly for your help!

Upvotes: 23

Views: 23585

Answers (4)

AvkashChauhan

Reputation: 20571

I had Spark working on my EC2 instance. I started a new web server, and to meet its requirements I had to change the hostname to the EC2 public DNS name, i.e.

hostname ec2-54-xxx-xxx-xxx.compute-1.amazonaws.com

After that, Spark no longer worked and showed the error below:

16/09/20 21:02:22 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
16/09/20 21:02:22 ERROR SparkContext: Error initializing SparkContext.

I solved it by setting SPARK_LOCAL_IP as below:

export SPARK_LOCAL_IP="localhost"

then launched the Spark shell as below:

$SPARK_HOME/bin/spark-shell

Upvotes: 5

Travis Carlson

Reputation: 671

Setting the environment variable SPARK_LOCAL_IP=127.0.0.1 solved this for me.
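If you cannot set the environment variable before the JVM starts, the driver's bind/advertise address can also be set on the SparkConf. This is only a minimal sketch (assuming Spark 1.x; the address below is a placeholder for whatever IP your driver machine is actually reachable on):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class DriverHostExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("POC_JAVA_AND_SPARK")
                .setMaster("spark://spark-master:7077")
                // Placeholder address: use the IP your driver machine is actually
                // reachable on, instead of whatever the local hostname resolves to.
                .set("spark.driver.host", "192.168.0.50");

        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Driver host: " + sc.getConf().get("spark.driver.host"));
        sc.stop();
    }
}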

Upvotes: 55

Costi Ciudatu

Reputation: 38195

I had this problem when my /etc/hosts file was mapping the wrong IP address to my local hostname.

The BindException in your logs complains about the IP address 192.168.0.191. I assume your hostname resolves to that address, but it is not the address your network interface is actually using. It should work fine once you fix that mapping.
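To see what the driver will try to bind to, you can resolve the local hostname from plain Java; this is just a diagnostic sketch (no Spark needed):

import java.net.InetAddress;

public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        // Roughly what Spark does when picking a default bind address:
        // resolve the machine's hostname and take the resulting IP.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("Hostname:    " + local.getHostName());
        System.out.println("Resolves to: " + local.getHostAddress());
        // If this prints 192.168.0.191 while your network interface has a
        // different address, correct the entry for that hostname in /etc/hosts.
    }
}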

Upvotes: 12

ayan guha

Reputation: 1257

Possibly your master is running on a non-default port. Can you post your submit command? Have a look at https://spark.apache.org/docs/latest/spark-standalone.html#connecting-an-application-to-the-cluster
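For reference, the master URL passed to setMaster has to match the spark://host:port shown at the top of the master's web UI exactly. A small sketch (hypothetical argument handling) that takes it from the command line instead of hard-coding it:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MasterUrlFromArgs {
    public static void main(String[] args) {
        // Hypothetical: pass the exact URL from the master web UI,
        // e.g. spark://spark-master:7077 (or your non-default port).
        String masterUrl = args.length > 0 ? args[0] : "spark://spark-master:7077";

        SparkConf conf = new SparkConf()
                .setAppName("POC_JAVA_AND_SPARK")
                .setMaster(masterUrl);

        JavaSparkContext sc = new JavaSparkContext(conf);
        sc.stop();
    }
}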

Upvotes: 1
