theonetruejason

Reputation: 49

Docker internal DNS slow to resolve internal service names

Posting this experience (as a question) in case others run into the same situation...

I was working on a multi-container Docker Compose deployment using several private REST APIs (in Python/Flask). The system was deployed on Ubuntu VM hosts and ran without problems for weeks. (This was an internal demo setup.)

The main GUI suddenly became nearly unresponsive. Every API call took so long that it caused random timeouts all over, rendering the GUI unusable. Investigation revealed that API calls were taking up to 10 seconds whenever one container tried to call an API in another.

The culprit was the /etc/resolv.conf file on the host. DevOps had changed all of the hosts to include a search line in resolv.conf, which got picked up by the Docker containers when they were restarted. This caused each internal API call to first try resolving the service name under each search domain, where it would never be found, waiting for a timeout before moving on to the next candidate.

Offending resolv.conf:

    search local.company.com company.com
    nameserver 127.0.0.11
    options ndots:2

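As a result, a connection to http://my-service:12345/api/v1/health was attempted first as my-service.local.company.com (timeout), then as my-service.company.com (timeout), and finally as plain my-service (API success). The search domains were tried first because my-service contains fewer dots than the ndots value.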
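The lookup order described above can be illustrated with a small model. This is only a simplified sketch of glibc-style search-list handling, not the resolver's actual code; the function name `candidate_names` is hypothetical, and other libc resolvers may differ in the details.

```python
def candidate_names(name, search_domains, ndots):
    """Return the names a stub resolver would try, in order (simplified model)."""
    # A trailing dot marks the name as fully qualified: no search list is applied.
    if name.endswith("."):
        return [name.rstrip(".")]

    expanded = [f"{name}.{domain}" for domain in search_domains]

    # With at least `ndots` dots, the name is tried as-is first;
    # otherwise the search domains are tried before the bare name.
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]


if __name__ == "__main__":
    search = ["local.company.com", "company.com"]
    for ndots in (2, 0):
        print(f"ndots:{ndots} ->", candidate_names("my-service", search, ndots))
```

With `ndots:2` the bare service name comes last, after both search domains; with `ndots:0` it is tried first, so the slow lookups are never reached when the name resolves.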

The answer that worked is included below.

Upvotes: 1

Views: 1483

Answers (1)

theonetruejason

Reputation: 49

Edit: Answering my own question, thanks Chris. :)

To fix this problem, I created a clean local version of the resolv.conf file:

    nameserver 127.0.0.11
    options ndots:0

I then added a volume to each container entry in docker-compose to mount ./resolv.conf:/etc/resolv.conf. This overrides the resolv.conf inherited from the host, and all internal service names now resolve quickly with no timeouts or delays.

    my-service:
      image: foo:latest
      container_name: "priv_api"
      volumes:
        # Force the container to use the clean file instead of the inherited one.
        - ./resolv.conf:/etc/resolv.conf
      networks:
        - nodeapp-network
The fun part of this bug was discovering that the change had been made to the host machine two weeks earlier and didn't affect the Docker containers until they were restarted.

Upvotes: 3
