Reputation: 2486
I submitted a Spark Streaming job to my standalone Spark cluster. The submit command is shown below:
./bin/spark-submit \
--master spark://ES01:7077 \
--executor-memory 4G --num-executors 1 \
/opt/flowSpark/sparkStream/latest5min.py 1>a.log 2>b.log
Note that I use --num-executors 1 because I only want one executor.
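For reference, --num-executors is shorthand for the spark.executor.instances property, so an equivalent submit (just a sketch of the same command) would be:
./bin/spark-submit \
--master spark://ES01:7077 \
--executor-memory 4G \
--conf spark.executor.instances=1 \
/opt/flowSpark/sparkStream/latest5min.py 1>a.log 2>b.log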
Then with the ps command I see the output below.
[root@ES01 ~]# ps -ef | grep java | grep -v grep | grep spark
root 11659 1 0 Apr19 ? 00:48:25 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4G -Xmx4G -XX:MaxPermSize=256m org.apache.spark.deploy.master.Master --ip ES01 --port 7077 --webui-port 8080
root 11759 1 0 Apr19 ? 00:42:59 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4G -Xmx4G -XX:MaxPermSize=256m org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://ES01:7077
root 18538 28335 38 16:13 pts/1 00:01:52 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master spark://ES01:7077 --executor-memory 4G --num-executors 1 /opt/flowSpark/sparkStream/latest5min.py
root 18677 11759 46 16:13 ? 00:02:14 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4096M -Xmx4096M -Dspark.driver.port=55652 -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://[email protected]:55652 --executor-id 0 --hostname 10.79.148.184 --cores 1 --app-id app-20160509161303-0048 --worker-url spark://[email protected]:35012
root 18679 11759 46 16:13 ? 00:02:13 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4096M -Xmx4096M -Dspark.driver.port=55652 -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://[email protected]:55652 --executor-id 1 --hostname 10.79.148.184 --cores 1 --app-id app-20160509161303-0048 --worker-url spark://[email protected]:35012
root 18723 11759 47 16:13 ? 00:02:14 java -cp /opt/spark-1.6.0-bin-hadoop2.6/conf/:/opt/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/hadoop-2.6.2/etc/hadoop/ -Xms4096M -Xmx4096M -Dspark.driver.port=55652 -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://[email protected]:55652 --executor-id 2 --hostname 10.79.148.184 --cores 1 --app-id app-20160509161303-0048 --worker-url spark://[email protected]:35012
From my understanding:
11659 and 11759 are the standalone cluster's Master and Worker processes.
18538 is the driver program.
18677, 18679, and 18723 should be the executor processes.
Why are there still three of them when I already used --num-executors 1?
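Counting them directly confirms it (a quick sketch with standard shell tools):
[root@ES01 ~]# ps -ef | grep CoarseGrainedExecutorBackend | grep -v grep | wc -l
3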
Upvotes: 0
Views: 768
Reputation: 500
If you are using YARN, you can check the executors by issuing the command below on the data node (where the executors are instantiated):
$ sudo -u yarn jps
11388 CoarseGrainedExecutorBackend
1854 Jps
11396 CoarseGrainedExecutorBackend
Each CoarseGrainedExecutorBackend process corresponds to one executor.
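The same idea works on a standalone worker host; as a sketch (run as the user that owns the Spark processes, root in the question), jps would list the three executors from the ps output above:
$ jps | grep CoarseGrainedExecutorBackend
18677 CoarseGrainedExecutorBackend
18679 CoarseGrainedExecutorBackend
18723 CoarseGrainedExecutorBackend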
Upvotes: 0
Reputation: 844
Check spark.executor.cores in your Spark defaults. From the documentation:
The number of cores to use on each executor. For YARN and standalone mode only.
In standalone mode, setting this parameter allows an application to run multiple executors on the same worker, provided that there are enough cores on that worker.
Otherwise, only one executor per application will run on each worker.
http://spark.apache.org/docs/latest/configuration.html#execution-behavior
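A sketch of how this likely plays out in the question (assuming spark.executor.cores is set to 1 in spark-defaults.conf and the worker offers at least 3 cores): with no cap on spark.cores.max, the standalone scheduler can launch three one-core executors on the single worker, which matches the three CoarseGrainedExecutorBackend processes above. One way to force a single executor in standalone mode is to cap the application's total cores instead of using --num-executors (which is a YARN-only flag in Spark 1.6):
./bin/spark-submit \
--master spark://ES01:7077 \
--executor-memory 4G \
--total-executor-cores 1 \
/opt/flowSpark/sparkStream/latest5min.py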
Upvotes: 1