anand011090

Reputation: 65

What is an Unmanaged Application Master, and what is its role in Hadoop YARN federation?

I can't find much information about how an Unmanaged AM (UAM) works. I know the basic definition, but I'm still not sure how UAMs are managed, and by whom.

The Apache documentation also says (point 8 in the job execution flow): "Based on a policy the AMRMProxy can impersonate the AM on other sub-clusters, by submitting an Unmanaged AM, and by forwarding the AM heartbeats to relevant sub-clusters. a. Federation supports multiple application attempts with AMRMProxy HA. AM containers will have different attempt id in home sub-cluster, but the same Unmanaged AM in secondaries will be used across attempts. b. When AMRMProxy HA is enabled, UAM token will be stored in Yarn Registry. In the registerApplicationMaster call of each application attempt, AMRMProxy will go fetch existing UAM tokens from registry (if any) and re-attached to the existing UAMs."

Thanks in advance for a detailed explanation.

Upvotes: 1

Views: 283

Answers (1)

eval

Reputation: 1239

UAM was introduced in https://issues.apache.org/jira/browse/YARN-420. The original purpose was:

It would be a useful improvement to enhance this model by allowing the AM to be launched independently by the client without requiring the RM. These AM's would be launched on a gateway machine that can talk to the cluster. This would open up new use cases such as the following

1) Easy debugging of AM, specially during initial development. Having the AM launched on an arbitrary cluster node makes it hard to looks at logs or attach a debugger to the AM. If it can be launched locally then these tasks would be easier.

2) Running AM's that need special privileges that may not be available on machines managed by the NodeManager

UAM now also plays an important role in the design of YARN federation, in the way you quoted:

AMRMProxy can impersonate the AM on other sub-clusters, by submitting an Unmanaged AM, and by forwarding the AM heartbeats to relevant sub-clusters.

The idea is simple. In federation, only one "real" AM is created after the application is submitted, and it runs in the home sub-cluster. However, to allocate task containers in other (secondary) sub-clusters, the application needs an AM in each of those sub-clusters to talk to its RM. YARN federation solves this by registering a UAM in each secondary sub-cluster; the UAM's only job is to forward the allocate heartbeats.

You can take a look at the code:

FederationInterceptor
    sendRequestsToResourceManagers()

UnmanagedAMPoolManager
UnmanagedApplicationManager
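
To make the forwarding idea above a bit more concrete, here is a toy model in plain Java. It is only a sketch of the concept, not the Hadoop implementation: `ToyResourceManager`, `ToyAmRmProxy`, and the `sc1`/`sc2` sub-cluster ids are names I made up for illustration, and the real classes listed above deal with tokens, policies, and asynchronous heartbeats that are left out here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Hadoop classes.
class FederationSketch {

    /** Hypothetical stand-in for one sub-cluster's ResourceManager. */
    static class ToyResourceManager {
        final String subClusterId;
        final List<String> registeredAms = new ArrayList<>();
        int heartbeats = 0;

        ToyResourceManager(String subClusterId) {
            this.subClusterId = subClusterId;
        }

        void register(String amName) {
            registeredAms.add(amName);
        }

        // Answers one allocate heartbeat by "granting" every requested container.
        List<String> allocate(int containers) {
            heartbeats++;
            List<String> granted = new ArrayList<>();
            for (int i = 0; i < containers; i++) {
                granted.add(subClusterId + "/container-" + heartbeats + "-" + i);
            }
            return granted;
        }
    }

    /** Hypothetical stand-in for the proxy between the real AM and the RMs. */
    static class ToyAmRmProxy {
        private final String homeSubCluster;
        private final Map<String, ToyResourceManager> rms;

        ToyAmRmProxy(String homeSubCluster, Map<String, ToyResourceManager> rms) {
            this.homeSubCluster = homeSubCluster;
            this.rms = rms;
        }

        // The one "real" AM is registered only with the home RM.
        void registerRealAm(String appId) {
            rms.get(homeSubCluster).register(appId + " (managed AM)");
        }

        // Splits one heartbeat per sub-cluster and merges the answers.
        // A UAM is registered in a secondary the first time it is needed.
        List<String> allocate(String appId, Map<String, Integer> askPerSubCluster) {
            List<String> merged = new ArrayList<>();
            for (Map.Entry<String, Integer> ask : askPerSubCluster.entrySet()) {
                ToyResourceManager rm = rms.get(ask.getKey());
                if (rm == null) {
                    throw new IllegalArgumentException("unknown sub-cluster " + ask.getKey());
                }
                String uamName = appId + " (UAM)";
                boolean secondary = !ask.getKey().equals(homeSubCluster);
                if (secondary && !rm.registeredAms.contains(uamName)) {
                    rm.register(uamName);
                }
                merged.addAll(rm.allocate(ask.getValue()));
            }
            return merged;
        }
    }

    public static void main(String[] args) {
        Map<String, ToyResourceManager> rms = new LinkedHashMap<>();
        rms.put("sc1", new ToyResourceManager("sc1")); // home sub-cluster
        rms.put("sc2", new ToyResourceManager("sc2")); // secondary sub-cluster
        ToyAmRmProxy proxy = new ToyAmRmProxy("sc1", rms);
        proxy.registerRealAm("app_1");

        Map<String, Integer> ask = new LinkedHashMap<>();
        ask.put("sc1", 1);
        ask.put("sc2", 2);
        System.out.println(proxy.allocate("app_1", ask));
        System.out.println("AMs known to sc2: " + rms.get("sc2").registeredAms);
    }
}
```

From the application's point of view there is still a single AM sending a single heartbeat; the proxy is what fans it out, and the secondary RM only ever sees the UAM that was registered on the application's behalf.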

Upvotes: 1
