Reputation: 41
I would like to get a better understanding of the communication exchange between YARN and Spark. For example:
Upvotes: 1
Views: 466
Reputation: 2214
Steps performed when you run spark-submit in YARN client mode:
1. The Spark driver internally invokes the submitApplication method of the Client class. This submits the Spark application to the YARN cluster (i.e. to the YARN ResourceManager) and returns the application's ApplicationId.
2. Spark then uses the ApplicationId obtained in step 1 and calls the createContainerLaunchContext method, which builds a YARN ContainerLaunchContext request asking a YARN NodeManager to launch the ApplicationMaster (AM) in a container.
3. Step 2 is what launches the ApplicationMaster for the application. If the cluster doesn't have the resources to start an AM, the launch fails and the driver shuts down with an exception. Once the AM is up and running, it contacts the driver to report that it is up. At this point the Spark YARN application is up and running.
4. The driver then asks the AM for resources (executors), and the AM in turn requests them from the YARN ResourceManager.
5. If YARN doesn't have that much capacity, it gives the Spark application whatever is available. If it has the capacity, it gives everything that was asked for.
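To make the flow above a bit more concrete, here is a minimal toy model in plain Python. It does not use any Spark or YARN API: `ToyResourceManager`, `run_client_mode`, and the idea of counting capacity in "slots" are all made up for illustration (real YARN accounts for memory and vcores). It only mirrors the behaviour described in the steps: the submission returns an application id, the AM launch fails if there is no room, and executors are granted up to whatever capacity is left.

```python
from dataclasses import dataclass
from typing import Tuple


class InsufficientResources(Exception):
    """Raised when the toy cluster has no room for the ApplicationMaster."""


@dataclass
class ToyResourceManager:
    # Hypothetical stand-in for cluster capacity: one slot = one container.
    free_slots: int
    next_app_number: int = 1

    def submit_application(self) -> str:
        # Submission hands back an application id for the new application.
        app_id = f"application_{self.next_app_number:04d}"
        self.next_app_number += 1
        return app_id

    def launch_application_master(self, app_id: str) -> None:
        # The AM needs one container; if none is free, the launch fails.
        if self.free_slots < 1:
            raise InsufficientResources(f"no capacity to start the AM for {app_id}")
        self.free_slots -= 1

    def allocate_executors(self, requested: int) -> int:
        # Grant what was asked for, capped at what is currently free.
        granted = min(requested, self.free_slots)
        self.free_slots -= granted
        return granted


def run_client_mode(rm: ToyResourceManager, requested_executors: int) -> Tuple[str, int]:
    """Walk through the submit, AM launch, and executor request sequence."""
    app_id = rm.submit_application()
    rm.launch_application_master(app_id)  # raises if the AM cannot start
    granted = rm.allocate_executors(requested_executors)
    return app_id, granted


if __name__ == "__main__":
    rm = ToyResourceManager(free_slots=4)
    app_id, granted = run_client_mode(rm, requested_executors=10)
    print(app_id, granted)  # application_0001 3
```

With 4 free slots, one goes to the AM and only 3 of the 10 requested executors are granted, which is the "give whatever is possible" case. With 0 free slots, `run_client_mode` raises `InsufficientResources`, corresponding to the driver shutting down with an exception.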
More details here - https://jaceklaskowski.gitbooks.io/mastering-apache-spark/yarn/spark-yarn-client.html
Upvotes: 2