JohnZhang

Reputation: 191

How to get container CPU/memory usage in Hadoop YARN

I am new to Hadoop/YARN and need to get the resource consumption of containers during task execution.

The documentation on the Apache Hadoop website says the "NodeManager ... is responsible for containers, monitoring their resource usage (cpu, memory, disk, network) and reporting the same to the ResourceManager". My understanding is that the NodeManager periodically reports resource usage together with its heartbeat.

However, when I look at the source code, in NodeStatusUpdaterImpl, totalResource is included in RegisterNodeManagerRequest. I think that request is sent when the NodeManager initializes, to tell the RM about its configured resources. But in NodeHeartbeatRequest, the NodeStatus only has the container IDs, with no CPU or memory figures.

So could you please clarify whether the CPU and memory used by a container are reported to the RM? How can I get such data?

Many thanks!

Upvotes: 3

Views: 3489

Answers (1)

hakunami

Reputation: 2441

This is the implementation of the container monitor:

hadoop-2.6.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java

It has methods that check whether a container is over its limit, and one of them, isProcessTreeOverLimit, will show you how YARN gets the memory usage of a given container (process). I am not sure whether there is an API we can use to get this information, but you can look at this file:

hadoop-2.6.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java

It shows you how YARN gets memory usage: by reading the per-process files under /proc. This answer will give you the command. I think it's possible to add some code to get memory usage without a YARN API (I hope YARN has such APIs too).
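To make the /proc idea concrete, here is a minimal sketch in Python (not YARN's code) that reads /proc/<pid>/stat for every process and sums memory and CPU over a process tree, which is roughly the approach ProcfsBasedProcessTree takes. It assumes Linux, and the field positions follow the proc(5) man page. The function names (parse_stat_line, read_all_stats, tree_usage) are made up for illustration, and you would have to find the container's root PID yourself.

```python
import os


def parse_stat_line(line):
    """Parse one /proc/<pid>/stat line into the few fields used here."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    left, right = line.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    rest = right.split()  # rest[0] is field 3 (state) in proc(5) numbering
    return {
        "pid": int(pid_str),
        "comm": comm,
        "ppid": int(rest[1]),        # field 4: parent PID
        "utime": int(rest[11]),      # field 14: user-mode CPU, clock ticks
        "stime": int(rest[12]),      # field 15: kernel-mode CPU, clock ticks
        "vsize": int(rest[20]),      # field 23: virtual memory, bytes
        "rss_pages": int(rest[21]),  # field 24: resident set size, pages
    }


def read_all_stats(proc_root="/proc"):
    """Return {pid: parsed stat} for every process currently visible."""
    stats = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                parsed = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        stats[parsed["pid"]] = parsed
    return stats


def tree_usage(stats, root_pid, page_size):
    """Sum memory and CPU over root_pid and all of its descendants."""
    children = {}
    for s in stats.values():
        children.setdefault(s["ppid"], []).append(s["pid"])
    total = {"rss_bytes": 0, "vsize_bytes": 0, "cpu_ticks": 0}
    seen, stack = set(), [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in stats:
            continue
        seen.add(pid)
        s = stats[pid]
        total["rss_bytes"] += s["rss_pages"] * page_size
        total["vsize_bytes"] += s["vsize"]
        total["cpu_ticks"] += s["utime"] + s["stime"]
        stack.extend(children.get(pid, []))
    return total


if __name__ == "__main__" and os.path.isdir("/proc"):
    # Demo only: uses this script's own PID as a stand-in for a container root.
    print(tree_usage(read_all_stats(), os.getpid(), os.sysconf("SC_PAGE_SIZE")))
```

Run this on the node where the container lives and pass the PID of the container's top-level process as root_pid. The CPU figure is cumulative clock ticks, so to get a usage rate you need to sample twice and take the difference.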

Upvotes: 2
