Reputation: 413
I'm reading "Hadoop: The Definitive Guide" (4th edition) and came across this explanation of YARN's DRF (in Chapter 4, "Dominant Resource Fairness"):
Imagine a cluster with a total of 100 CPUs and 10 TB of memory. Application A requests containers of (2 CPUs, 300 GB), and application B requests containers of (6 CPUs, 100 GB). A’s request is (2%, 3%) of the cluster, so memory is dominant since its proportion (3%) is larger than CPU’s (2%). B’s request is (6%, 1%), so CPU is dominant. Since B’s container requests are twice as big in the dominant resource (6% versus 3%), it will be allocated half as many containers under fair sharing.
I cannot understand the meaning of "it will be allocated half as many containers under fair sharing". I guess "it" here is Application B, and Application B is allocated half the number of Application A's containers. Is that right? Why is Application B allocated fewer containers even though it requires more resources?
Any suggestions or pointers to explanatory documents would be much appreciated. Thank you in advance.
Upvotes: 10
Views: 6345
Reputation: 6343
The Dominant Resource Calculator is based on the concept of Dominant Resource Fairness (DRF).
To understand DRF, you can refer to the paper here: https://people.eecs.berkeley.edu/~alig/papers/drf.pdf
In this paper, refer to section 4.1, where an example is given.
DRF tries to equalise the dominant shares of the applications (here: A's share of the cluster's memory = B's share of the cluster's CPUs).
Explanation
Total resources available: 100 CPUs, 10000 GB Memory
Requirements of Application A: 2 CPUs, 300 GB Memory
Requirements of Application B: 6 CPUs, 100 GB Memory
A's dominant resource is Memory (2% of CPUs vs 3% of Memory)
B's dominant resource is CPU (6% of CPUs vs 1% of Memory)
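The percentages above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `dominant_resource` and the dictionary keys are made up for this answer and are not part of YARN's API.

```python
from fractions import Fraction

# Cluster capacity from the example (hypothetical key names).
CLUSTER = {"cpu": 100, "memory_gb": 10_000}


def dominant_resource(demand, capacity=CLUSTER):
    """Return (resource name, share) for the resource with the largest share."""
    # Share of each resource that one container would take from the cluster.
    shares = {name: Fraction(amount, capacity[name]) for name, amount in demand.items()}
    name = max(shares, key=shares.get)
    return name, shares[name]


app_a = dominant_resource({"cpu": 2, "memory_gb": 300})
app_b = dominant_resource({"cpu": 6, "memory_gb": 100})
print(app_a)
print(app_b)
```

For A the dominant resource comes out as memory with a share of 3/100, and for B it is CPU with a share of 6/100 (reduced to 3/50), matching the 3% and 6% figures.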
Let's assume that "A" is assigned x containers and "B" is assigned y containers.
Resource requirements of A: 2x CPUs + 300x GB Memory (2 CPUs and 300 GB Memory for each container)
Resource requirements of B: 6y CPUs + 100y GB Memory (6 CPUs and 100 GB Memory for each container)
Total requirement is:
2x + 6y <= 100 CPUs
300x + 100y <= 10000 GB Memory
DRF will try to equalise the dominant needs of A and B.
A's dominant need: 300x / 10000 GB (300x out of 10000 GB of total memory)
B's dominant need: 6y / 100 CPUs (6y out of 100 CPUs)
DRF will try to equalise: (300x / 10000) = (6y / 100)
Solving the above equation gives: x = 2y
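If you substitute x = 2y into the total-requirement constraints above, the CPU constraint becomes 10y <= 100 (so y <= 10) and the memory constraint becomes 700y <= 10000 (so y <= 14). CPU is the tighter limit, so the largest allocation is x=20 and y=10.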
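The substitution above can also be checked numerically. The following is a minimal sketch for the two-application case only; `drf_two_apps` is a hypothetical helper written for this answer (not a YARN function), and it ignores the fact that containers must be whole numbers.

```python
from fractions import Fraction


def drf_two_apps(capacity, demand_a, demand_b):
    """Solve for (x, y) with equal dominant shares under the capacity limits."""
    # Dominant share taken by a single container of each application.
    share_a = max(Fraction(d, c) for d, c in zip(demand_a, capacity))
    share_b = max(Fraction(d, c) for d, c in zip(demand_b, capacity))
    # Equal dominant shares: x * share_a == y * share_b, so x = ratio * y.
    ratio = share_b / share_a
    # Each resource i requires (ratio * a_i + b_i) * y <= c_i; take the tightest bound.
    y = min(Fraction(c) / (ratio * a + b) for a, b, c in zip(demand_a, demand_b, capacity))
    return ratio * y, y


# Resources are ordered as (CPUs, GB of memory).
x, y = drf_two_apps((100, 10_000), (2, 300), (6, 100))
print(x, y)  # 20 10
```

The CPU bound (y <= 10) is tighter than the memory bound (y <= 100/7), which is why CPU ends up fully used while memory does not.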
It means:
Application A is allocated 20 containers: (40 CPUs, 6000 GB of Memory)
Application B is allocated 10 containers: (60 CPUs, 1000 GB of Memory)
You can see that:
Total allocated CPU is:
40 + 60 <= 100 CPUs available
Total allocated Memory is:
6000 + 1000 <= 10000 GB of Memory available
So, the above solution explains the meaning of the sentence:
Since B’s container requests are twice as big in the dominant resource (6% versus 3%), it will be allocated half as many containers under fair sharing.
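Another way to see the same result is to hand out containers one at a time, always to the application whose dominant share is currently lowest, which is the greedy idea described in the DRF paper. The sketch below is a simplified illustration written for this answer, not YARN's scheduler code; `progressive_filling` and its stopping rule (stop as soon as the chosen application's next container does not fit) are assumptions made for the example.

```python
from fractions import Fraction


def progressive_filling(capacity, demands):
    """Greedily give one container at a time to the app with the lowest dominant share."""
    used = [0] * len(capacity)
    containers = {app: 0 for app in demands}

    def dominant_share(app):
        # Largest fraction of any cluster resource currently held by this app.
        return max(Fraction(containers[app] * d, c) for d, c in zip(demands[app], capacity))

    while True:
        # Ties are broken by dictionary order (first app wins).
        app = min(demands, key=dominant_share)
        needed = [u + d for u, d in zip(used, demands[app])]
        if any(n > c for n, c in zip(needed, capacity)):
            return containers  # the next container does not fit: stop
        used = needed
        containers[app] += 1


# Resources are ordered as (CPUs, GB of memory).
result = progressive_filling((100, 10_000), {"A": (2, 300), "B": (6, 100)})
print(result)  # {'A': 20, 'B': 10}
```

Because each of B's containers raises its dominant share twice as fast as one of A's, B gets picked half as often, which is exactly the "half as many containers" statement in the book.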
Upvotes: 33