Mayank Porwal

Reputation: 34086

How to run Spark SQL on a 10-node cluster

I am using Spark for the first time. I have set up Spark on Hadoop 2.7 on a cluster with 10 nodes. On my master node, the following processes are running:

hduser@hadoop-master-mp:~$ jps
20102 ResourceManager
19736 DataNode
20264 NodeManager
24762 Master
19551 NameNode
24911 Worker
25423 Jps

Now, I want to write Spark SQL to do a certain computation on a 1 GB file, which is already present in HDFS.

If I go into the Spark shell on my master node: spark-shell

and write the following query, will it run only on my master, or will it use all 10 nodes as workers?

scala> sqlContext.sql("CREATE TABLE sample_07 (code string,description string,total_emp int,salary int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TextFile")

If not, what do I have to do to make my Spark SQL use the full cluster?

Upvotes: 0

Views: 1958

Answers (1)

Srinivasarao Daruna

Reputation: 3374

You need a cluster manager to manage the master and workers. You can use the Spark standalone, YARN, or Mesos cluster manager. I would suggest the Spark standalone cluster manager instead of YARN, just to get started.

To get started, download a Spark distribution (pre-built for Hadoop) on all the nodes, and set the Hadoop classpath and other important configuration in spark-env.sh.

1) Start the master using ./sbin/start-master.sh

This starts a web interface (default port 8080). Open the Spark master web page and note the Spark master URI shown on the page.

2) Go to every node, including the machine where you started the master, and start a worker (slave), passing it the master URI from step 1:

./sbin/start-slave.sh <spark-master-uri>

Check the master web page again. It should list all the workers. If it does not, check the logs to find the error.

3) Compare the cores and memory that each machine actually has with what the master web page shows for each worker. If they do not match, you can adjust the worker start options to allocate them.
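The three steps above could be sketched as a small shell script. This is only a sketch under assumptions: it is meant to run from the Spark installation directory, the hostname hadoop-master-mp is taken from the shell prompt in the question, 7077 is the default standalone master port, and the core and memory values are placeholders. The real master URI should be copied from the master web page. By default the script only prints the commands; it runs them when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: hostname, port, and resource values below are assumptions.
MASTER_HOST="${MASTER_HOST:-hadoop-master-mp}"  # from the question's shell prompt
MASTER_PORT="${MASTER_PORT:-7077}"              # default standalone master port
MASTER_URL="spark://${MASTER_HOST}:${MASTER_PORT}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Step 1 (master node only): start the standalone master.
start_master() {
    run ./sbin/start-master.sh
}

# Steps 2 and 3 (every node): start a worker and cap what it offers.
# --cores and --memory are worker options; 4 and 8g are placeholders.
start_worker() {
    run ./sbin/start-slave.sh "$MASTER_URL" \
        --cores "${WORKER_CORES:-4}" --memory "${WORKER_MEMORY:-8g}"
}

start_master
start_worker
```

The same limits can instead be set once per node through SPARK_WORKER_CORES and SPARK_WORKER_MEMORY in spark-env.sh, which avoids repeating the flags on every start.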

Use Spark 1.5.2 or later, and please follow the details here.

As this is just a starting point, let me know if you face any errors and I can help you out.
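Once the workers are listed on the master web page, the shell still has to be pointed at the master. If spark-shell is started with no --master option and no spark.master setting, it runs in local mode on the machine where it was launched. A minimal sketch follows, assuming the master URI is spark://hadoop-master-mp:7077 (hostname from the question's prompt, default port); replace it with the URI shown on the master web page. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the master URL is an assumption; copy the real one
# from the master web page.
MASTER_URL="${MASTER_URL:-spark://hadoop-master-mp:7077}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Attach the interactive shell to the standalone master so that
# queries are scheduled on the registered workers.
open_shell() {
    run ./bin/spark-shell --master "$MASTER_URL"
}

open_shell
```

Inside the shell, sc.master should then show the spark:// URI rather than a local master, and the shell should appear as a running application on the master web page.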

Upvotes: 1
