Reputation: 3270
I'm learning Spark but am confused about whether I have to run Spark on Hadoop/YARN or Mesos.
Is there any performance gain if I run on Hadoop/Mesos?
Right now, I am running in standalone mode on a 4-node cluster.
Can any experienced user provide some guidance here?
Upvotes: 1
Views: 584
Reputation: 10428
Depending on the details of your use case, performance may go up or down in one configuration compared with another. However, Hadoop and Mesos give you advantages beyond performance. There are many in each case, but for example:
Hadoop
Mesos - Mesos is more focused on a specific role than Hadoop, namely managing resources across a cluster of machines. However, it does this across a range of workload types. These could be data processing jobs such as Spark, distributed applications in Akka, distributed databases, etc. It can move tasks to other machines if one machine fails.
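Whichever cluster manager you choose, the Spark application itself stays largely the same; what mainly changes is the master URL you pass to `spark-submit`. Here is a rough sketch of that difference. The helper names `master_url` and `submit_command`, the host `node1`, and the file `my_app.py` are made up for illustration; the URL schemes and default ports (7077 for the standalone master, 5050 for the Mesos master) are the usual defaults, so adjust them if your cluster differs.

```python
# Default master ports; these are the usual defaults, not guaranteed for your cluster.
DEFAULT_PORTS = {"standalone": 7077, "mesos": 5050}
URL_SCHEMES = {"standalone": "spark", "mesos": "mesos"}


def master_url(manager, host=None, port=None):
    """Return the value to pass as --master for the given cluster manager.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    if manager == "yarn":
        # With YARN the ResourceManager address is read from the Hadoop
        # configuration (HADOOP_CONF_DIR / YARN_CONF_DIR), so no host is given.
        return "yarn"
    if manager not in URL_SCHEMES:
        raise ValueError(f"unknown cluster manager: {manager}")
    if host is None:
        raise ValueError(f"{manager} mode needs the master host name")
    chosen_port = port if port is not None else DEFAULT_PORTS[manager]
    return f"{URL_SCHEMES[manager]}://{host}:{chosen_port}"


def submit_command(manager, app, host=None, port=None):
    """Build the spark-submit argument list for one application."""
    return ["spark-submit", "--master", master_url(manager, host, port), app]


if __name__ == "__main__":
    # "node1" and "my_app.py" are placeholder names.
    for manager, host in [("standalone", "node1"), ("yarn", None), ("mesos", "node1")]:
        print(" ".join(submit_command(manager, "my_app.py", host=host)))
```

So moving your 4-node standalone cluster to YARN or Mesos later is mostly a deployment change rather than a rewrite of your jobs.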
I recommend watching this video (I was lucky enough to attend this meetup live): https://www.youtube.com/watch?v=gzx4-6RB7Yw
It demonstrates the use of Spark, HDFS, Mesos and Docker to do distributed computing on a cluster of Amazon cloud machines.
Upvotes: 4