WestCoastProjects

Reputation: 63231

Spark is not started automatically on the AWS cluster - how to launch it?

A Spark cluster has been launched using the ec2/spark-ec2 script from within the branch-1.4 codebase.

I can log in to it - and it reports 1 master, 2 slaves:

11:35:10/sparkup2 $ec2/spark-ec2  -i ~/.ssh/hwspark14.pem  login hwspark14
Searching for existing cluster hwspark14 in region us-east-1...
Found 1 master, 2 slaves.
Logging into master ec2-54-83-81-165.compute-1.amazonaws.com...
Warning: Permanently added 'ec2-54-83-81-165.compute-1.amazonaws.com,54.83.81.165' (RSA) to the list of known hosts.
Last login: Tue Jun 23 20:44:05 2015 from c-73-222-32-165.hsd1.ca.comcast.net

       __|  __|_  )
       _|  (     /   Amazon Linux AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2013.03-release-notes/
Amazon Linux version 2015.03 is available.

But .. where are they?? The only Java processes running are:

It is a surprise to me that the Spark Master and Workers are not started. When looking for the processes to start them manually, it is not at all obvious where they are located.
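To double-check what is actually running on the master, a quick generic sketch (nothing spark-ec2 specific is assumed here):

```shell
# list running Java processes; the [j] bracket trick keeps grep
# from matching its own process, and the fallback makes the
# "nothing running" case explicit
ps -ef | grep '[j]ava' || echo "no java processes found"
```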

Hints on where the start scripts are located and how to launch the Master and Workers manually would be appreciated. In the meantime I will do an exhaustive:

 find / -name start-all.sh

And .. survey says:

[root@ip-10-151-25-94 etc]$ find / -name start-all.sh
/root/persistent-hdfs/bin/start-all.sh
/root/ephemeral-hdfs/bin/start-all.sh

Which suggests to me that Spark was not even installed?

Update: I wonder, is this a bug in 1.4.0? I ran the same set of commands against 1.3.1 and the Spark cluster came up.

Upvotes: 2

Views: 217

Answers (1)

vvladymyrov

Reputation: 5793

There was a bug in the Spark 1.4.0 provisioning script (which spark-ec2 clones from the github repository https://github.com/mesos/spark-ec2/) with similar symptoms - Apache Spark hadn't started. The reason was that the provisioning script failed to download the Spark archive.

Check whether Spark was downloaded and uncompressed on the master host with ls -altr /root/spark - there should be several directories there. From your description it looks like the /root/spark/sbin/start-all.sh script is missing.
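The check above can be scripted. A minimal sketch - `check_spark_install` is a hypothetical helper name, and `/root/spark` is the install path assumed by this answer:

```shell
# report whether a Spark install directory looks complete by
# probing for the sbin/start-all.sh launcher script
check_spark_install() {
  local dir="${1:-/root/spark}"
  if [ -x "$dir/sbin/start-all.sh" ]; then
    echo "spark looks installed in $dir"
  else
    echo "spark missing or incomplete in $dir"
  fi
}

check_spark_install /root/spark
```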

Also check the contents of the log file with cat /tmp/spark-ec2_spark.log - it should have information about the uncompressing step.

Another thing to try is to run spark-ec2 with a different provisioning-script branch by adding --spark-ec2-git-branch branch-1.4 to the spark-ec2 command line arguments.
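Put together, the relaunch would look something like the line below. The key path and cluster name are copied from the question; the block only assembles and prints the command rather than executing it, since actually running it launches EC2 instances:

```shell
# build the spark-ec2 invocation with the provisioning-script
# branch pinned to branch-1.4 (printed here, not executed)
cmd="ec2/spark-ec2 -i ~/.ssh/hwspark14.pem --spark-ec2-git-branch branch-1.4 launch hwspark14"
echo "$cmd"
```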

Also, when you run spark-ec2, save all the output and check whether there is anything suspicious:

spark-ec2 <...args...> 2>&1 | tee start.log
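Once start.log has been captured, something like the following can surface failures quickly. The marker list is just a guess at common failure wording, not anything spark-ec2 guarantees to print:

```shell
# scan the captured launch log for typical failure markers;
# the fallback keeps the pipeline's exit status clean even
# when the log is absent or clean
grep -inE 'error|fail|unable to|no such file' start.log 2>/dev/null \
  || echo "no obvious failure markers found"
```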

Upvotes: 2
