How to avoid ExecutorFailure Error in Spark

How can I avoid executor failures while Spark jobs are running? We are using Spark 1.6 as part of Cloudera CDH 5.10. I regularly get the error below.

ExecutorLostFailure (executor 21 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 127100 ms

Upvotes: 1

Views: 538

Answers (1)

Rahul Sharma

Reputation: 5834

Slow task execution that ends in a timeout can have several causes, so you need to drill down to find the root cause. Tuning the default timeout parameters sometimes helps as well. Open the Environment tab in the Spark UI, check the current values of the parameters below, then increase them in spark-submit.

spark.worker.timeout
spark.network.timeout
spark.akka.timeout
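
As a rough sketch of how these could be passed to spark-submit, the snippet below builds the repeated `--conf key=value` arguments from a dictionary. The timeout values, the helper name `conf_flags`, and the script name `my_job.py` are all assumptions made for illustration, not recommended settings. Note that spark.worker.timeout is documented under standalone mode, so it may have no effect on a YARN-based CDH cluster.

```python
import shlex

# Illustrative values only (assumptions, not recommendations); tune per workload.
timeouts = {
    "spark.network.timeout": "600s",  # default is 120s in Spark 1.6
    "spark.akka.timeout": "600s",     # Akka communication timeout in Spark 1.6
    "spark.worker.timeout": "600",    # plain seconds; standalone mode only
}


def conf_flags(settings):
    """Turn a dict of Spark properties into repeated --conf key=value arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# my_job.py is a placeholder for the actual application.
command = ["spark-submit"] + conf_flags(timeouts) + ["my_job.py"]

# Print a shell-safe command line that can be copied into a terminal.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can also be handed to `subprocess.run(command)` directly if the job is launched from a Python wrapper.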

Running the job with speculative execution (spark.speculation=true) can also help: if one or more tasks in a stage are running slowly, Spark re-launches them as speculative copies.
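
Speculation has a few companion properties that control when a task counts as slow. The sketch below renders them in the whitespace-separated `key value` form used by `conf/spark-defaults.conf`. The interval, multiplier, and quantile shown are what I believe to be the documented Spark 1.6 defaults, included only so they are visible to tune; the helper name `to_defaults_conf` is hypothetical.

```python
# Speculation settings; only spark.speculation differs from the assumed defaults.
speculation = {
    "spark.speculation": "true",            # off by default
    "spark.speculation.interval": "100ms",  # how often to check for slow tasks
    "spark.speculation.multiplier": "1.5",  # times slower than the median
    "spark.speculation.quantile": "0.75",   # fraction of tasks done before checking
}


def to_defaults_conf(settings):
    """Render properties as aligned 'key  value' lines for spark-defaults.conf."""
    width = max(len(key) for key in settings)
    return "\n".join(f"{key.ljust(width)}  {value}" for key, value in settings.items())


defaults_conf = to_defaults_conf(speculation)
print(defaults_conf)
```

The same keys work as `--conf` arguments to spark-submit when editing the cluster-wide defaults file is not an option.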

See the Spark 1.6.0 configuration properties documentation for more options.

Upvotes: 1
