Reputation: 637
Our Spark job sometimes stops for a reason that is unknown to us.
The only clue we see in the logs is the repeated statement failed: Set(),
as illustrated below.
Any ideas as to why these messages are displayed would be greatly appreciated.
18/02/08 22:12:14 INFO Executor: Finished task 0.0 in stage 51.0 (TID 38). 2008 bytes result sent to driver
18/02/08 22:12:14 INFO TaskSetManager: Finished task 0.0 in stage 51.0 (TID 38) in 312094 ms on localhost (executor driver) (1/1)
18/02/08 22:12:14 INFO TaskSchedulerImpl: Removed TaskSet 51.0, whose tasks have all completed, from pool
18/02/08 22:12:14 INFO DAGScheduler: ShuffleMapStage 51 (rdd at EsSparkSQL.scala:97) finished in 602.298 s
18/02/08 22:12:14 INFO DAGScheduler: looking for newly runnable stages
18/02/08 22:12:14 INFO DAGScheduler: running: Set(ShuffleMapStage 60, ShuffleMapStage 1, ShuffleMapStage 19, ShuffleMapStage 6)
18/02/08 22:12:14 INFO DAGScheduler: waiting: Set(ShuffleMapStage 45, ShuffleMapStage 16, ShuffleMapStage 3, ShuffleMapStage 18, ShuffleMapStage 39, ShuffleMapStage 10, ShuffleMapStage 55, ShuffleMapStage 62, ShuffleMapStage 41, ResultStage 63, ShuffleMapStage 49, ShuffleMapStage 5, ShuffleMapStage 35, ShuffleMapStage 42, ShuffleMapStage 21, ShuffleMapStage 57, ShuffleMapStage 14, ShuffleMapStage 29)
18/02/08 22:12:14 INFO DAGScheduler: failed: Set()
18/02/08 22:13:33 INFO JDBCRDD: closed connection
18/02/08 22:13:33 INFO Executor: Finished task 0.0 in stage 60.0 (TID 44). 2008 bytes result sent to driver
18/02/08 22:13:33 INFO TaskSetManager: Finished task 0.0 in stage 60.0 (TID 44) in 196274 ms on localhost (executor driver) (1/1)
18/02/08 22:13:33 INFO TaskSchedulerImpl: Removed TaskSet 60.0, whose tasks have all completed, from pool
18/02/08 22:13:33 INFO DAGScheduler: ShuffleMapStage 60 (rdd at EsSparkSQL.scala:97) finished in 681.143 s
18/02/08 22:13:33 INFO DAGScheduler: looking for newly runnable stages
18/02/08 22:13:33 INFO DAGScheduler: running: Set(ShuffleMapStage 1, ShuffleMapStage 19, ShuffleMapStage 6)
18/02/08 22:13:33 INFO DAGScheduler: waiting: Set(ShuffleMapStage 45, ShuffleMapStage 16, ShuffleMapStage 3, ShuffleMapStage 18, ShuffleMapStage 39, ShuffleMapStage 10, ShuffleMapStage 55, ShuffleMapStage 62, ShuffleMapStage 41, ResultStage 63, ShuffleMapStage 49, ShuffleMapStage 5, ShuffleMapStage 35, ShuffleMapStage 42, ShuffleMapStage 21, ShuffleMapStage 57, ShuffleMapStage 14, ShuffleMapStage 29)
18/02/08 22:13:33 INFO DAGScheduler: failed: Set()
18/02/08 22:28:02 INFO JDBCRDD: closed connection
Upvotes: 7
Views: 5525
Reputation: 406
According to the answer to the question "Spark 1.0.2 (also 1.1.0) hangs on a partition", the line DAGScheduler: failed: Set()
means that the set of failed stages is empty. It is informational rather than an error: after each stage completes, the DAGScheduler logs its running, waiting, and failed stage sets (as in the log excerpt above), and an empty failed set simply means no stages have failed so far.
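As a quick illustration of why the message reads exactly "Set()": Scala's default toString for an empty immutable Set is the literal text Set(), so the scheduler is just printing an empty collection. A minimal sketch in plain Scala (no Spark required; the variable names are only illustrative):

```scala
// The DAGScheduler log line "failed: Set()" is simply the default
// toString of an empty Scala Set interpolated into the log message.
object FailedSetDemo {
  def main(args: Array[String]): Unit = {
    val failedStages = Set.empty[Int]      // no failed stage IDs
    println(s"failed: $failedStages")      // prints: failed: Set()

    val waitingStages = Set(45, 16, 3)     // a non-empty set, for contrast
    println(s"waiting: $waitingStages")    // prints something like: waiting: Set(45, 16, 3)
  }
}
```

So the repeated "failed: Set()" lines are healthy output; a genuinely failed stage would show up inside the set, e.g. failed: Set(ShuffleMapStage 51).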
Upvotes: 10