Reputation: 643
I'm running Zeppelin 0.8.1 and have configured it to submit Spark jobs to a YARN 2.7.5 cluster, with interpreters in both cluster mode (i.e. the application master runs on YARN, not on the driver host) and client mode.
The YARN applications started in client mode are killed immediately after I stop the Zeppelin server. The jobs started in cluster mode, however, become zombie-like and start hogging all the resources in the YARN cluster (there is no dynamic resource allocation).
Is there a way to make Zeppelin kill those jobs when it exits? Or anything else that solves this problem?
Upvotes: 1
Views: 547
Reputation: 32660
Starting from version 0.8, Zeppelin provides a way to shut down idle interpreters by setting zeppelin.interpreter.lifecyclemanager.timeout.threshold.
See Interpreter Lifecycle Management in the Zeppelin documentation.
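Using the timeout also requires selecting the timeout-based lifecycle manager. A minimal sketch for conf/zeppelin-site.xml (property names as in the 0.8 docs; the values below are example settings, in milliseconds):

<!-- Enable the timeout-based interpreter lifecycle manager -->
<property>
  <name>zeppelin.interpreter.lifecyclemanager.class</name>
  <value>org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager</value>
</property>
<!-- How often to check for idle interpreters (example: every minute) -->
<property>
  <name>zeppelin.interpreter.lifecyclemanager.timeout.checkinterval</name>
  <value>60000</value>
</property>
<!-- Shut down interpreters idle for longer than this (example: 1 hour) -->
<property>
  <name>zeppelin.interpreter.lifecyclemanager.timeout.threshold</name>
  <value>3600000</value>
</property>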
Before that was available, I used a simple shell script that checks the applications running on YARN and kills any Zeppelin interpreter that has been running for more than an hour:
#!/bin/bash
max_life_in_mins=60

# Collect the IDs of all running Zeppelin Spark interpreter applications
zeppelinApps=$(yarn application -list 2>/dev/null | grep "RUNNING" | grep "Zeppelin Spark Interpreter" | awk '{print $1}')

for jobId in $zeppelinApps
do
  # Finish-Time is 0 while the application is still running
  finish_time=$(yarn application -status "$jobId" 2>/dev/null | grep "Finish-Time" | awk '{print $NF}')
  if [ "$finish_time" -ne 0 ]; then
    echo "App $jobId is not running"
    continue
  fi

  # Start-Time is reported in milliseconds since the epoch
  start_time=$(yarn application -status "$jobId" 2>/dev/null | grep "Start-Time" | awk '{print $NF}')
  time_diff_in_mins=$(( ($(date +%s) - start_time / 1000) / 60 ))

  if [ "$time_diff_in_mins" -gt "$max_life_in_mins" ]; then
    echo "Killing app $jobId"
    yarn application -kill "$jobId"
  fi
done
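The script only helps if it runs periodically, for example from cron. An illustrative crontab entry, assuming the script is saved as /opt/scripts/kill-idle-zeppelin.sh (a hypothetical path), running it every 10 minutes:

# Check for long-running Zeppelin interpreters every 10 minutes
*/10 * * * * /opt/scripts/kill-idle-zeppelin.sh >> /var/log/kill-idle-zeppelin.log 2>&1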
The same can also be done through the YARN REST API.
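For instance, an application can be killed with a PUT to the ResourceManager's Cluster Application State API. A sketch, assuming an unsecured cluster; rm-host:8088 and the application ID are placeholders:

# Kill a YARN application via the ResourceManager REST API
curl -X PUT -H "Content-Type: application/json" \
     -d '{"state": "KILLED"}' \
     "http://rm-host:8088/ws/v1/cluster/apps/application_1234567890123_0001/state"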
Upvotes: 3