GreenThumb

Reputation: 523

Docker healthcheck reporting "healthy" always

I want the container to be reported as "unhealthy" when it becomes so (based on various conditions). For now, the health endpoint just returns 500 on even-numbered calls and 200 OK on odd-numbered calls.

My Dockerfile looks like this:

FROM golang:alpine

RUN apk update
RUN apk add curl
RUN mkdir /service
COPY healthcheck.go /service
COPY ./counts /service

EXPOSE 9080

WORKDIR /service

HEALTHCHECK --interval=5s --timeout=500ms CMD curl --fail http://localhost:9080/health || exit 1

CMD ["go", "run", "/service/healthcheck.go"]

With docker inspect I can see both timeouts (induced from code) and successful checks. However, "Health.Status" in the inspect output always shows

"Status": "healthy"

docker inspect output:

        "Health": {
            "Status": "healthy",
            "FailingStreak": 1,
            "Log": [
                {
                    "Start": "2018-03-10T02:44:12.48947433Z",
                    "End": "2018-03-10T02:44:12.99252883Z",
                    "ExitCode": -1,
                    "Output": "Health check exceeded timeout (500ms)"
                },
                {
                    "Start": "2018-03-10T02:44:18.004402431Z",
                    "End": "2018-03-10T02:44:18.069316531Z",
                    "ExitCode": 0,
                    "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\nThis time it has to be healthy 252\n\r100    43  100    43    0     0  43000      0 --:--:-- --:--:-- --:--:-- 43000\nnext253"
                },
                {
                    "Start": "2018-03-10T02:44:23.078242333Z",
                    "End": "2018-03-10T02:44:23.583552633Z",
                    "ExitCode": -1,
                    "Output": "Health check exceeded timeout (500ms)"
                },
                {
                    "Start": "2018-03-10T02:44:28.593083534Z",
                    "End": "2018-03-10T02:44:28.665864034Z",
                    "ExitCode": 0,
                    "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100    43  100    43    0     0   7166      0 --:--:-- --:--:-- --:--:--  8600\n\nThis time it has to be healthy 254\nnext255"
                },
                {
                    "Start": "2018-03-10T02:44:33.671220836Z",
                    "End": "2018-03-10T02:44:34.177248436Z",
                    "ExitCode": -1,
                    "Output": "Health check exceeded timeout (500ms)"
                }
            ]
        }
    },

Any pointers on how to get the container reported as unhealthy?

Upvotes: 0

Views: 5016

Answers (3)

MrHetii

Reputation: 1495

Time for a bit of magic without curl or any other external stuff:

There is a difference between the Ubuntu-like 'nc' and the busybox 'nc' used in the alpine image.

The point is that regular nc waits for a response, and the busybox one seems not to.

Because of that I use { ... } to group 'printf' and 'sleep' into a single command group whose output is piped into nc.

That way, nc has a chance to receive the response from the endpoint and pipe it on to grep.

The exit status of grep decides the health status.

HEALTHCHECK --interval=1s --timeout=5s --retries=3 \
  CMD { printf "GET /fpm-ping HTTP/1.0\r\n\r\n"; sleep 0.5; } | nc -w 1 127.0.0.1 8080 | grep pong

Upvotes: 2

Jay Lim

Reputation: 412

Yes, you can make Docker report the container as unhealthy by changing the HEALTHCHECK in your Dockerfile to the one below:

HEALTHCHECK --interval=5s --retries=1 --timeout=500ms CMD curl --fail http://localhost:9080/health || exit 1

If a single run of the check takes longer than timeout seconds then the check is considered to have failed.

It takes retries consecutive failures of the health check for the container to be considered unhealthy.

(Ref: https://docs.docker.com/engine/reference/builder/#healthcheck)

By default, retries is 3, so the container is considered unhealthy only after three consecutive failed checks. At the moment, you return status 500 on even-numbered requests and status 200 on odd-numbered ones. When a check fails (on an even-numbered request), the next check 5 seconds later lands on an odd-numbered request and succeeds, which resets the failure streak. The streak never reaches three, so Docker keeps reporting the container as healthy.

By setting retries to 1, Docker reports the container as unhealthy as soon as a single check fails, then waits 5 seconds (the interval) before running the health check again.

Upvotes: 1

GreenThumb

Reputation: 523

Turns out --retries was the solution.

The changed Dockerfile is listed here:

FROM golang:alpine

RUN apk update
RUN apk add curl
RUN mkdir /service
COPY healthcheck.go /service
COPY ./counts /service

EXPOSE 9080

WORKDIR /service

HEALTHCHECK --interval=5s --timeout=500ms --retries=1 CMD curl --fail http://localhost:9080/health || exit 1

CMD ["go", "run", "/service/healthcheck.go"]

Upvotes: 0
