peter

Reputation: 684

With how many Spark nodes should I use Mesos or YARN?

I currently run a cluster with 4 Spark nodes and 1 Solr node. I want to expand the cluster quickly to 20 nodes and afterwards to around 100. I am just not sure at what cluster size it makes sense to use Mesos or YARN. Does it make sense to add YARN or Mesos when I have fewer than 100 nodes?

Thanks

Upvotes: 0

Views: 511

Answers (1)

Sai Krishna

Reputation: 624

Mesos and YARN can both scale up to thousands of nodes without any issue.

It is the workload that decides which one to use. If your workload consists only of Spark or Hadoop jobs, YARN is the better choice; if you also have Docker containers or something else to run, Mesos is the better choice.

There are many other advantages and disadvantages to using Mesos; see the comparison here.

A Spark standalone cluster provides almost all the same features as the other cluster managers if you are only running Spark.

If you would like to run Spark alongside other applications, or to use richer resource scheduling capabilities (e.g. queues), both YARN and Mesos provide these features. Of these, YARN will likely be preinstalled in many Hadoop distributions.
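Whichever manager you pick, the application itself mostly stays the same; what changes is the `--master` value passed to `spark-submit`. A minimal sketch of that, assuming a placeholder host name (`master-host`), the usual default ports (7077 for standalone, 5050 for Mesos), and a made-up helper called `submit_args`:

```python
def master_url(manager, host="master-host"):
    """Return the --master value for a given cluster manager.

    `host` is a placeholder; the ports are the common defaults
    and may differ on your cluster.
    """
    urls = {
        "standalone": f"spark://{host}:7077",
        # For YARN the cluster location is read from the Hadoop
        # configuration (HADOOP_CONF_DIR / YARN_CONF_DIR), not the URL.
        "yarn": "yarn",
        "mesos": f"mesos://{host}:5050",
    }
    return urls[manager]


def submit_args(manager, app, queue=None, host="master-host"):
    """Build a spark-submit argument list (hypothetical helper)."""
    args = ["spark-submit", "--master", master_url(manager, host)]
    if queue is not None:
        # spark-submit's --queue flag only applies when running on YARN.
        if manager != "yarn":
            raise ValueError("--queue is a YARN-only option")
        args += ["--queue", queue]
    return args + [app]
```

For example, `submit_args("yarn", "app.py", queue="etl")` builds a command that submits to a YARN queue named `etl` (the queue name is made up here), while `submit_args("standalone", "app.py")` targets the standalone master.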

If you have fewer than 100 nodes and you are not going to run any other applications alongside Spark, a Spark standalone cluster is the better choice; adding a separate cluster manager would be overkill.

Again, it depends on the capabilities you want to use. If you need queues or a scheduler such as the Fair Scheduler, YARN or Mesos makes sense. (Whether you need these features depends on what you do with the Spark cluster, the workload, and how busy your cluster is.)
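The rule of thumb in this answer can be written down as a small function. This is only a sketch of the reasoning above, not an official recommendation; the function name, its parameters, and the category labels are all made up for illustration.

```python
def suggest_cluster_manager(other_workloads="none", needs_queues=False):
    """Suggest a cluster manager following the rule of thumb above.

    other_workloads: "none"   -> only Spark runs on the cluster
                     "hadoop" -> other Hadoop jobs share the cluster
                     "other"  -> Docker containers or other applications
    needs_queues:    True if you want queues or richer scheduling
    """
    if other_workloads == "other":
        return "mesos"
    if other_workloads == "hadoop":
        return "yarn"
    if needs_queues:
        # Either YARN or Mesos would provide this; YARN is picked
        # here only because many Hadoop distributions ship it.
        return "yarn"
    # Spark only, no special scheduling needs.
    return "standalone"
```

Note that node count does not appear as an input: both YARN and Mesos scale far beyond 100 nodes, so the decision is driven by the workload rather than by cluster size.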

Upvotes: 1
