osr

Reputation: 21

Spark Executors - Are they java processes?

I am new to Spark. When I run spark-submit in client mode with 3 executors, I expect 3 java processes (since there are 3 executors) to show up when I execute ps -ef:

$SPARK_HOME/bin/spark-submit --num-executors 3 --class AverageCalculation --master local[1] /home/customer/SimpleETL/target/SimpleETL-0.1.jar hdfs://node1:9000/home/customer/SimpleETL/standard_input.csv

But I don't see 3 java processes. My understanding is that each executor process is a java process. Please advise. Thanks.

Upvotes: 2

Views: 3632

Answers (4)

Radhakrishnan Rk

Reputation: 561

Each executor is a java process. Each executor comprises a JVM.

jps

The number of java processes is the same as the number of executors. If the executors are distributed across worker nodes, you need to check the processes on the corresponding worker nodes. You can get information about the executors and where they were launched from the Spark history server web UI.
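As a rough sketch (assuming a standalone cluster and that you can ssh to each worker host; the host names below are placeholders), you can run jps on every worker and count the executor JVMs, which show up under the main class CoarseGrainedExecutorBackend:

for host in worker1 worker2 worker3; do
    ssh "$host" 'jps -l | grep CoarseGrainedExecutorBackend'
done

Each matching line is one executor JVM running on that worker.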

Upvotes: 2

osr

Reputation: 21

/home/spark/spark-2.2.1-bin-hadoop2.7/bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --num-executors 1000 \
    --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 \
    --queue default /home/spark/spark-2.2.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.1.jar

I executed the command above and checked ps -ef | grep java, but I don't see a lot of java processes. Any easy way to identify the executors?
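Since this submit uses --master yarn --deploy-mode cluster, the executors (and the driver) run inside YARN containers on the cluster nodes, not on the machine where spark-submit was launched, so ps -ef there only shows the client. A rough way to locate them (the node manager host below is a placeholder):

yarn application -list
ssh <nodemanager-host> 'jps -l | grep CoarseGrainedExecutorBackend'

The first command lists the running application and its tracking URL (the Spark UI, whose Executors tab shows where each executor was launched); the second lists executor JVMs on a given node manager host.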

Upvotes: 0

Shrinivas Deshmukh

Reputation: 697

In Spark, there are master nodes and worker nodes. Executors run on worker nodes in their own java processes.

In your spark-submit you can add --deploy-mode cluster and see that executors are running on worker nodes in their own JVM instances.
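For example, a minimal sketch of the original submit with --deploy-mode cluster added (the standalone master URL spark://node1:7077 is a placeholder; note that --num-executors is honored only on YARN, so on a standalone master the executor count is driven by core and memory settings instead):

$SPARK_HOME/bin/spark-submit --class AverageCalculation \
    --master spark://node1:7077 --deploy-mode cluster \
    /home/customer/SimpleETL/target/SimpleETL-0.1.jar hdfs://node1:9000/home/customer/SimpleETL/standard_input.csv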

You can check this answer for a detailed workflow of Apache Spark.

Upvotes: 0

Alper t. Turker

Reputation: 35229

Because you use local mode (--master local[1]), executor settings are not applicable. In this case, Spark starts only a single JVM which emulates all components, and allocates the number of threads specified in the local definition (1) as executor threads.

In other modes, executors are separate JVM instances.
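A quick way to see this (a sketch; the thread name pattern assumes Spark's default executor thread pool naming):

ps -ef | grep SparkSubmit
jstack <pid-of-that-jvm> | grep "Executor task launch worker"

In local mode the first command shows a single SparkSubmit JVM, and the second shows that the "executors" are really just threads inside it.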

Upvotes: 6
