sclee1

Reputation: 1281

How to run spark-shell on a multi-node cluster?

I have been using spark-submit to test my code on a multi-node system (of course, I specified the master option as the master server address to get the multi-node environment). However, instead of using spark-submit, I would like to use spark-shell to test my code on the cluster. The problem is that I don't know how to configure the multi-node cluster settings for spark-shell.

I think that just running spark-shell without changing any settings will result in local mode.

I searched for information and followed the commands below.

scala> sc.stop()
...

scala> import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.{SparkContext, SparkConf}

scala> val sc = new SparkContext(new SparkConf().setAppName("shell").setMaster("my server address"))
...

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext

scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@567a2954

However, I am not sure whether this is the right way to set up a multi-node cluster from spark-shell.

Upvotes: 2

Views: 2159

Answers (2)

OneCricketeer

Reputation: 191844

If you used setMaster("my server address") and "my server address" is not "local", then it won't run in local mode.

It is fine to set the master address in the code, but in production you'd set the --master parameter on the command line for spark-shell or spark-submit.
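
For example (a sketch; the master URL, class name and jar path below are just placeholders for your own values):

# master chosen at launch time instead of being hard-coded in the application
spark-shell --master spark://<master-host>:7077
spark-submit --master spark://<master-host>:7077 --class com.example.MyApp my-app.jar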

You can also write a separate .scala file, and pass that to spark-shell -i <filename>.scala
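
A minimal sketch of that workflow, assuming a script named test.scala (the file name, path and query are made up for illustration):

// test.scala -- reuses the sc and sqlContext that spark-shell already provides
val df = sqlContext.read.json("hdfs:///path/to/data.json")  // placeholder path
df.printSchema()
println(df.count())

Then launch it against the cluster:

spark-shell --master spark://<master-host>:7077 -i test.scala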

Upvotes: 1

T. Gawęda

Reputation: 16086

Have you tried the --master parameter of spark-shell? For Spark Standalone:

./spark-shell --master spark://master-ip:7077

spark-shell is just a driver; it will connect to whatever cluster you specify in the --master parameter.
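
Once the shell is up, a quick sanity check is to print the master it actually connected to; it should echo the URL you passed:

scala> sc.master
res0: String = spark://master-ip:7077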

Edit:

For YARN, use:

./spark-shell --master yarn
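
Resource flags can be passed the same way (the sizes below are only an example; on YARN, spark-shell always runs in client deploy mode):

./spark-shell --master yarn --deploy-mode client --num-executors 4 --executor-memory 2g --executor-cores 2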

Upvotes: 3
