Techie

Reputation: 45124

Which should I choose for a reporting platform: yarn-cluster or yarn-client?

I'm planning to develop a reporting platform on top of existing data. I have an existing RDBMS that holds a large number of records, so I'm using the following architecture: Hadoop 2.7, Spark, Hive, JasperReports, and Sqoop.

Given that I have already read the following

Which mode should I use, and why? What should the decision be based on?

Upvotes: 0

Views: 424

Answers (2)

Ravindra babu

Reputation: 38910

Adding some more info to Daniel Darabos's answer: apart from how the application is hosted, how failover works, and where the driver runs (in the Application Master in yarn-cluster mode, or in the client process in yarn-client mode), the other features remain the same. However, yarn-client mode supports spark-shell, unlike yarn-cluster mode.


Have a look at this article to learn the differences between running a Spark application in the various modes: YARN cluster, YARN client, and Spark standalone.

Make a calculated decision after weighing these criteria for each option.

Upvotes: 1

Daniel Darabos

Reputation: 27455

The decision is about whether you want your application to run as a YARN application or not.

A non-YARN application (which is what the driver is in yarn-client mode) is simpler. It's a classic Linux application: you start it like any other program, and it runs on the machine where you launched it.

A YARN application (which you get in yarn-cluster mode) is managed by YARN. It runs on whatever machine YARN decides to put it on. If it dies, YARN will restart it, perhaps on a different machine. It is more robust (e.g. it will get restarted if the machine dies) but at the cost of complexity (e.g. you don't have a fixed IP address for the application).

I'd go with yarn-client at first. You can switch to yarn-cluster later if you find you need the features it provides.
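To illustrate how small the switch is, here is a minimal sketch in Python that only builds the `spark-submit` command line for either mode; it does not launch anything. The helper name `spark_submit_command` and the script name `report_job.py` are made up for this example. It assumes the Spark 2.x style `--master yarn --deploy-mode ...` flags; on Spark 1.x the equivalent would be `--master yarn-client` or `--master yarn-cluster`. The optional `spark.yarn.maxAppAttempts` setting is one way to cap how often YARN retries the application.

```python
def spark_submit_command(deploy_mode, app="report_job.py", max_app_attempts=None):
    """Build a spark-submit argument list for running on YARN.

    deploy_mode: "client" (driver runs where you launch it) or
                 "cluster" (driver runs inside the YARN Application Master).
    app: hypothetical application script, used only for illustration.
    max_app_attempts: optional cap on YARN application attempts.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")

    cmd = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]

    if max_app_attempts is not None:
        # Limits how many times YARN will attempt to run the application.
        cmd += ["--conf", f"spark.yarn.maxAppAttempts={max_app_attempts}"]

    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Start simple: the driver is a plain process on the launching machine.
    print(" ".join(spark_submit_command("client")))
    # Later, if you need YARN to manage the driver, only the mode changes.
    print(" ".join(spark_submit_command("cluster", max_app_attempts=2)))
```

The application code itself would normally stay the same between the two modes; what changes is where the driver process lives and who restarts it.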

Upvotes: 1
