Reputation: 1281
I have been using spark-submit to test my code on a multi-node system (of course, I set the master option to the master server's address to get a multi-node environment). However, instead of spark-submit, I would like to use spark-shell to test my code on the cluster, and I don't know how to configure multi-node cluster settings in spark-shell. I assume that just launching spark-shell without changing any settings results in local mode.
I searched for information and came up with the commands below.
scala> sc.stop()
...
scala> import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.{SparkContext, SparkConf}
scala> val sc = new SparkContext(new SparkConf().setAppName("shell").setMaster("my server address"))
...
scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@567a2954
However, I am not sure whether this is the right way to set up spark-shell for a multi-node cluster.
Upvotes: 2
Views: 2159
Reputation: 191844
If you used setMaster("my server address") and "my server address" is not "local", then it won't run in local mode.
It is fine to set the master address in code, but in production you'd pass the --master parameter on the CLI to spark-shell or spark-submit.
You can also write a separate .scala file and pass it to spark-shell with -i <filename>.scala.
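For illustration, a minimal sketch of that workflow (the file name, master URL, and the tiny job are assumptions, not something from the question):
// test-job.scala — a hypothetical script passed to spark-shell with -i.
// Assumed invocation, using an example standalone master URL:
//   ./spark-shell --master spark://master-ip:7077 -i test-job.scala

// spark-shell already provides `sc` (SparkContext) and `spark` (SparkSession),
// so no manual context setup is needed here.
println(s"Running against master: ${sc.master}")

// A trivial distributed job to confirm work actually runs on the cluster.
val total = sc.parallelize(1 to 1000).map(_ * 2).sum()
println(s"Sum of doubled values: $total")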
Upvotes: 1
Reputation: 16086
Have you tried the --master parameter of spark-shell? For Spark Standalone:
./spark-shell --master spark://master-ip:7077
The Spark shell is just a driver; it will connect to whichever cluster you specify in the master parameter.
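As a quick sanity check once the shell is up (a minimal sketch; the master URL in the comment is just the example address above):
// Inside spark-shell, confirm which master the driver actually connected to.
// sc is the SparkContext that spark-shell creates automatically.
println(sc.master)    // e.g. spark://master-ip:7077 rather than local[*]
println(sc.isLocal)   // false when connected to a real cluster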
Edit:
For YARN, use:
./spark-shell --master yarn
Upvotes: 3