Ben Haim Shani

Reputation: 265

Spark failed: Futures timed out

I'm using Apache Spark 2.2.1 running on an Amazon EMR cluster. Sometimes jobs fail with 'Futures timed out':

java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:401)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:764)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:762)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

I changed two parameters in spark-defaults.conf:

spark.sql.broadcastTimeout 1000
spark.network.timeout 10000000

but it didn't help.

Do you have any suggestions on how to handle this timeout?

Upvotes: 4

Views: 4731

Answers (1)

hba

Reputation: 7800

Have you tried setting spark.yarn.am.waitTime?

Only used in cluster mode. Time for the YARN Application Master to wait for the SparkContext to be initialized.

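The quote above is from the Spark documentation on running on YARN. The default for this setting is 100s, which matches the 100000 milliseconds in your stack trace.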

A bit more context on my situation:

I am using spark-submit to run a Java Spark job. I deploy the client to the cluster (cluster deploy mode), and the client was doing a very long-running operation, which caused the timeout.

I got around it by:

spark-submit --master yarn --deploy-mode cluster --conf "spark.yarn.am.waitTime=600000" 
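
Since the question keeps its settings in spark-defaults.conf, the same property can probably go there instead of on the command line. A minimal sketch, assuming the value is read as milliseconds when no unit is given (as in the command above) and that 10 minutes is long enough for your driver to create its SparkContext:

```
# Cluster mode only: how long the YARN Application Master waits
# for the SparkContext to be initialized (600000 ms = 10 minutes)
spark.yarn.am.waitTime 600000
```

The 600000 value is just the one that worked for my job, not a recommendation. If the driver does its slow work before creating the SparkContext, moving that work after the context is created may avoid the need for a longer wait at all.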

Upvotes: 1
