Maz

Reputation: 193

Spark Shell on Yarn getting connection refused error

I have Hadoop 2.7.0 and Spark 2.0.0 on Ubuntu 14.04, with one master node and two slave nodes. All daemons have started fine. When I start spark-shell without Yarn, the following runs fine:

scala> val inputRDD = sc.textFile("/spark_examples/war_and_peace.txt")
inputRDD: org.apache.spark.rdd.RDD[String] = /spark_examples/war_and_peace.txt MapPartitionsRDD[1] at textFile at <console>:24

scala> inputRDD.collect
res0: Array[String] = Array(The Project Gutenberg EBook of War and Peace, by Leo Tolstoy, "", This eBook is for the use of anyone anywhere at no cost and with almost, no restrictions whatsoever.  You may copy it, give it away or re-use it, under the terms of the Project Gutenberg License included with this, eBook or online at www.gutenberg.org, "", "", Title: War and Peace, "", Author: Leo Tolstoy, "", Translators: Louise and Aylmer Maude, "", Posting Date: January 10, 2009 [EBook #2600], "", Last Updated: March 15, 2013, "", Language: English, "", Character set encoding: ASCII, "", *** START OF THIS PROJECT GUTENBERG EBOOK WAR AND PEACE ***, "", An Anonymous Volunteer, and David Widger, "", "", "", "", "", WAR AND PEACE, "", By Leo Tolstoy/Tolstoi, "", CONTENTS, "", BOOK ONE: 1805, "",...
scala> 

But when I start spark-shell with Yarn, it throws the following error:

scala> val inputRDD = sc.textFile("/spark_examples/war_and_peace.txt")
inputRDD: org.apache.spark.rdd.RDD[String] = /spark_examples/war_and_peace.txt MapPartitionsRDD[1] at textFile at <console>:24

scala> inputRDD.collect
[Stage 0:>                                                          (0 + 2) / 2]17/04/03 21:31:04 ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks 
java.io.IOException: Failed to connect to HadoopSlave2/192.168.78.136:44749
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
    at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
    at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
    at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:554)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: HadoopSlave2/192.168.78.136:44749
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more

Have I missed anything in the configuration?

Upvotes: 1

Views: 2062

Answers (1)

RBanerjee

Reputation: 947

Spark uses random ports for internal communication between the driver and the executors, and these may be blocked by your firewall. Try opening the ports between your cluster nodes.
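As a quick sanity check, you can run something like the following from the driver machine's Scala REPL to see whether the executor port from the stack trace is reachable. The host and port here are copied from the error above and are only an example; they will change on every run, since the port is chosen at random:

import java.net.{InetSocketAddress, Socket}

// Host/port copied from the "Failed to connect" message above;
// they differ on every run because the port is picked at random.
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("HadoopSlave2", 44749), 5000)
  println("port reachable")
} catch {
  case e: java.io.IOException => println(s"connection refused/blocked: ${e.getMessage}")
} finally {
  socket.close()
}

If this fails with "Connection refused" while the executor is running, the firewall between the nodes is the likely culprit.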

If you are strict about firewall rules even within the cluster, you can also pin these ports to fixed values:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setMaster(master)                       // e.g. "yarn"
  .setAppName("namexxx")
  .set("spark.driver.port", "51810")       // driver RPC port
  .set("spark.fileserver.port", "51811")
  .set("spark.broadcast.port", "51812")
  .set("spark.replClassServer.port", "51813")
  .set("spark.blockManager.port", "51814") // port used for block fetches (the one failing above)
  .set("spark.executor.port", "51815")

Refer here: https://stackoverflow.com/a/30036642/3080158

Upvotes: 1
