Reputation: 17256
I have a simple Ruby app: it receives data via an HTTP endpoint, processes it a little, groups it, and sends it in batches to a remote HTTP endpoint.
When I run this on bare metal, I saturate 4 CPUs at 100% and get about 3000 req/s (according to ab; the app is a bit computationally intensive). When I run it in Docker, I get only about 1700 req/s and the CPUs seem to peak at 55-65%. Same app, same settings.
I tried increasing ab's concurrency. The app itself is hosted in Passenger; I tried running it with 20 and with 40 processes. Inside Docker it just doesn't seem to want to go higher.
I run it via docker-compose; the host is Ubuntu 14.04.
$ docker -v
Docker version 1.10.0, build 590d5108
$ docker-compose -v
docker-compose version 1.5.2, build 7240ff3
The load average is high in both cases (about 20), but it's not disk-bound.
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- ---system--- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
22 0 0 8630704 71160 257040 0 0 29 6 177 614 3 1 94 1 0
7 0 0 8623252 71160 257084 0 0 0 16 9982 83401 46 12 43 0 0
43 0 0 8618844 71160 257088 0 0 0 0 9951 74056 52 10 38 0 0
17 0 0 8612796 71160 257088 0 0 0 0 10143 70098 52 14 34 0 0
17 0 0 8606756 71160 257092 0 0 0 0 11324 70113 48 15 37 0 0
31 0 0 8603748 71168 257104 0 0 0 32 9907 85295 44 12 41 3 0
21 0 0 8598708 71168 257104 0 0 0 0 9895 69090 52 11 36 0 0
22 0 0 8594316 71168 257108 0 0 0 0 9885 68336 53 12 35 0 0
31 0 0 8589564 71168 257124 0 0 0 0 10355 82218 44 13 43 0 0
It's also not network-bound. Even if I disable sending data to the remote host, so that all communication stays within the machine, I still see 55-65% CPU.
The Docker and Compose setups are the defaults, nothing tweaked.
Why can't I saturate CPUs when it's running inside Docker? Is there some hidden limit in Docker? How do I discover this limitation?
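(To be concrete about "discover": what I'm after is something like the check below, i.e. whether any explicit CPU limit is attached to the container at all. The docker inspect field names and cgroup paths assume Docker 1.10 with the default cgroupfs driver; <container> and <full-container-id> are placeholders.)
# does Docker think the container has any CPU constraint?
docker inspect --format '{{.HostConfig.CpuShares}} {{.HostConfig.CpuQuota}} {{.HostConfig.CpusetCpus}}' <container>
# and what actually ended up in the container's cgroup? (-1 / empty means no limit)
cat /sys/fs/cgroup/cpu/docker/<full-container-id>/cpu.cfs_quota_us
cat /sys/fs/cgroup/cpuset/docker/<full-container-id>/cpuset.cpus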
Setting cpuset_cpus: 0,1,2,3,4,5,6,7 and/or cpu_shares: 102400 (100 times the default) doesn't seem to change the situation.
There is also nothing interesting about limits in /var/log/*.
It is also not the Docker bridge network: the effect is the same when I use net: host in Docker Compose.
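For reference, this is roughly how those attempts look in my docker-compose.yml (Compose 1.5 syntax); the service name and image are placeholders, and if an option doesn't take effect it's worth checking its spelling against the Compose reference for your version:
app:
  image: my-passenger-app        # placeholder image name
  net: host                      # bypass the Docker bridge network
  cpuset_cpus: 0,1,2,3,4,5,6,7   # pin to all 8 cores
  cpu_shares: 102400             # 100x the default of 1024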
If I run a second container with the same code on a different exposed port, I can get the CPU load up to 77%, but still not the 100% I see on bare metal. Note that each of those containers runs 20-40 processes load-balanced by Passenger inside.
OK, it seems to have something to do with Ubuntu. Running the same container on CoreOS, I'm able to saturate all cores.
But I still don't understand the limitation.
To be completely fair, I took two identical 16 GB / 8 CPU instances on DigitalOcean, both in the Frankfurt datacenter, and installed the app on the most recent Ubuntu and the most recent CoreOS alpha:
CoreOS 949.0.0: Docker version 1.10.0, build e21da33
Ubuntu 14.04.3: Docker version 1.10.0, build 590d5108
I'm not sure how to get exactly the same builds: CoreOS ships Docker built in (with a read-only FS), and on Ubuntu I have no idea how to get exactly build e21da33. But the general version is the same, 1.10.0.
I run ab from an external DigitalOcean machine, also in the Frankfurt datacenter, to make sure ab itself is not the source of variation. I hit the external IP in both cases. The ab parameters are the same (ab -n 40000 -c 1000 -k), and the code is the same.
The results:
Ubuntu: 58-60% CPU 1162.22 [#/sec] (mean)
CoreOS: 100% CPU 4440.45 [#/sec] (mean)
This starts to get really weird.
To give Ubuntu a chance I also tried adding:
security_opt:
- apparmor:unconfined
But that didn't change much.
Ubuntu 14.04.3 NOT OK (50-60% CPU)
Ubuntu 15.10 NOT OK (50-60% CPU)
Debian 8.3 NOT OK (50-60% CPU)
CentOS 7.2.1511 OK (100% CPU)
CoreOS 949.0.0 OK (100% CPU)
I still have no idea what the limitation is. It seems to be related to the Debian family (Ubuntu and Debian are affected, while CentOS and CoreOS are not).
Upvotes: 40
Views: 3617
Reputation: 1
Not sure this is related to this topic, but I have struggled with this a lot.
I had an issue with the number of cores used by Docker: it used only 60 of 128 cores. Interestingly, docker info also showed me 128 cores available. Then I started exploring the cgroups and found that /sys/fs/cgroup/cpuset/docker/cpuset.cpus was 0-59 instead of 0-127 as in /sys/fs/cgroup/cpuset/cpuset.cpus.
After I changed it to 0-127, my problem disappeared.
So, related to your problem, you may want to explore the CPU quota or period as well.
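Concretely, the check and the change were along these lines (the 0-127 range is specific to that 128-core box, so adjust it to your machine):
# compare the root cpuset with the one Docker puts containers under
cat /sys/fs/cgroup/cpuset/cpuset.cpus          # 0-127 here
cat /sys/fs/cgroup/cpuset/docker/cpuset.cpus   # was 0-59
# widen Docker's cpuset to all cores
echo 0-127 | sudo tee /sys/fs/cgroup/cpuset/docker/cpuset.cpus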
Upvotes: 0
Reputation: 444
We had the same problem. We started digging in and found this: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
You can pass --cpu-quota to Docker, and you want it to correspond to the number of CPUs you wish to use. For example, if you want the container to be able to use 4 CPUs, set it to 400000; if you want it completely unconstrained, specify -1.
Worked for us.
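For example (the 400000 value assumes the default --cpu-period of 100000 µs, so it works out to roughly 4 CPUs; the image name is a placeholder):
# allow roughly 4 CPUs worth of CFS quota per scheduling period
docker run --cpu-quota=400000 my-image
# or remove the CFS quota entirely
docker run --cpu-quota=-1 my-image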
Upvotes: 0
Reputation: 1329
Starting Docker with systemd fixed that issue for me (Ubuntu 16.04). All 12 of my threads are used at 100% in a single container when benchmarking.
Stop Docker service:
sudo service docker stop
And start it with systemctl:
sudo systemctl start docker
To start Docker at boot:
sudo systemctl enable docker
Upvotes: 3
Reputation: 678
Please don't get excited (or flame me): this is not the answer, I just need more space than a comment will allow! I am not a Linux or Docker expert, but I really like this sort of problem, have done some research over the weekend, and have a few avenues to explore that may help. I don't have a test rig, so I have reached an impasse.
Theories so far ("for Debian and Ubuntu..."):
Docker is putting the container and its sub-processes into a cgroup that is being throttled in some way (a quick check for this is sketched under "What I'd look at next" below).
The scheduler for the OS and the scheduler within the Docker container (systemd?) are in some way 'fighting' for the CPU and constantly displacing each other.
The OS scheduler is treating (a) the Docker container and (b) the app inside it as separate competing resource requests and is therefore giving each about 50%.
It seems to me that the Red Hat flavours of Linux have in some way 'integrated' Docker (read: "looked at what it does and tweaked their OS or Docker setup to be compatible"). Whatever they changed to do this may be the thing that makes the difference.
There is a strong push not to use Docker under RHEL 6 but to use RHEL 7+ instead. What did they change in RHEL between these versions, with respect to CPU scheduling, that makes them so keen on 7+?
What I'd look at next:
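First, a couple of quick checks for theory 1: find out which cgroup the app processes actually land in, and whether the CFS bandwidth controller is throttling that cgroup. This is only a sketch: the paths assume cgroup v1 with Docker's default cgroupfs driver, and <container-id> is a placeholder.
# which cgroups is the app (PID 1 inside the container) assigned to on the host?
docker exec <container-id> cat /proc/1/cgroup
# is the CFS bandwidth controller throttling the container's cgroup?
# nr_throttled and throttled_time should stay at 0 if no quota is in effect
cat /sys/fs/cgroup/cpu/docker/<container-id>/cpu.stat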
Research:
https://goldmann.pl/blog/2014/09/11/resource-management-in-docker/
http://www.janoszen.com/2013/02/06/limiting-linux-processes-cgroups-explained/
https://github.com/docker/docker/issues/6791
https://github.com/ibuildthecloud/systemd-docker/issues/15
https://unix.stackexchange.com/questions/151883/limiting-processes-to-not-exceed-more-than-10-of-cpu-usage
http://linux.die.net/man/5/limits.conf
https://marketplace.automic.com/details/centos-official-docker-image
https://www.datadoghq.com/blog/how-to-monitor-docker-resource-metrics/
https://libraries.io/go/github.com%2Fintelsdi-x%2Fsnap-plugin-collector-docker%2Fdocker
https://serverfault.com/questions/356962/where-are-the-default-ulimit-values-set-linux-centos
https://www.centos.org/forums/viewtopic.php?t=8956
https://docs.mongodb.org/manual/reference/ulimit/
http://www.unixarena.com/2013/12/how-to-increase-ulimit-values-in-redhat.html
If none of this helps I apologise!
Upvotes: 8