Reputation: 23
I have set up a Hadoop cluster with 2 workers. Spark is installed and works with YARN. I start
$ pyspark or $ sparkR
and the API starts normally and can actually perform calculations, but it loses its workers after about a minute. I followed the instructions exactly according to this tutorial (https://cloud.google.com/solutions/monte-carlo-methods-with-hadoop-spark). About a minute after launching sparkR or pyspark I get this error:
16/01/20 16:56:35 ERROR org.apache.spark.scheduler.cluster.YarnScheduler: Lost executor 2 on hadoopcluster-w-1.c.hadoop-1196.internal: remote Rpc client disassociated
16/01/20 16:56:38 ERROR org.apache.spark.scheduler.cluster.YarnScheduler: Lost executor 1 on hadoopcluster-w-0.c.hadoop-1196.internal: remote Rpc client disassociated
I have searched all over for a solution. I have seen lots of people say to increase spark.yarn.executorMemory, but that did not work. I recreated a brand-new project to reproduce the problem and hit the same issue. Could someone knowledgeable in Spark try to create a cluster, run the scripts by following the tutorial I posted above, and suggest a fix? Thank you!
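For completeness, executor memory settings like this are normally passed when launching the shell; the property names below are the standard Spark 1.x ones for YARN and the values are only placeholders for what I experimented with, not a recommendation:
$ pyspark --conf spark.executor.memory=4g --conf spark.yarn.executor.memoryOverhead=1024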
Upvotes: 0
Views: 1692
Reputation: 23
Thanks for the replies. It turns out this is just "harmless logspam due to the known Spark issue for dynamic allocation". See https://issues.apache.org/jira/browse/SPARK-4134 and "Google Dataproc - disconnect with executors often".
Upvotes: 1
Reputation: 3532
If running this
(sc.parallelize(1 to 4, 2)
  .map(i => playSession(100000, 100, 250000))
  .map(i => if (i == 0) 1 else 0)
  .reduce(_ + _) / 4.0)
does not give you any errors, it means that your issue is caused by memory (and you will not be able to fix it by changing your cluster's settings).
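The snippet above relies on the playSession function defined in the tutorial linked in the question. If you want something self-contained to paste into spark-shell for the same check, a hypothetical stand-in (not the tutorial's actual implementation, just a placeholder with the same shape) could look like this:

import scala.util.Random

// Hypothetical stand-in for the tutorial's playSession helper.
// Simulates up to `rounds` even-odds bets of 1 starting from `startBalance`;
// returns 0 if the gambler goes broke, otherwise the remaining balance,
// so the `if (i == 0) 1 else 0` step above counts ruined sessions.
def playSession(rounds: Int, startBalance: Int, target: Int): Int = {
  var balance = startBalance
  var played = 0
  while (played < rounds && balance > 0 && balance < target) {
    if (Random.nextBoolean()) balance += 1 else balance -= 1
    played += 1
  }
  if (balance <= 0) 0 else balance
}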
Upvotes: 0