Reputation: 22170
I recently set up healthchecks in my docker-compose config.
They work well and I like them. Here's a typical example:
services:
  app:
    healthcheck:
      test: curl -sS http://127.0.0.1:4000 || exit 1
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 30s
My container is quite slow to boot, so I set a 30-second start_period.
But it doesn't quite fit my needs: I don't need a check every 5 seconds, but my orchestration does need to know as soon as possible when the container is ready for the first time. Since my start_period is approximate, if the container is not ready at the first check, I have to wait a full interval before the next attempt.
What I'd like to have is:
Is there a way to achieve this out of the box with docker-compose?
I could write a custom script to achieve this, but I'd rather have a native solution if it is possible.
Upvotes: 12
Views: 6089
Reputation: 22170
I wrote a script that does this, though I'd rather find a native solution:
#!/bin/sh
# Run the given healthcheck command; once it has succeeded, skip re-running
# it until the marker file is more than five minutes old.
HEALTHCHECK_FILE="/root/.healthchecked"
COMMAND=${*?"Usage: healthcheck_retry <COMMAND>"}

if [ -r "$HEALTHCHECK_FILE" ]; then
    # The marker file's modification time is the last successful check
    LAST_HEALTHCHECK=$(date -r "$HEALTHCHECK_FILE" +%s)
    # GNU date alternative: date -d 'now - 5 minutes' +%s
    FIVE_MINUTES_AGO=$(( $(date +%s) - 5 * 60 ))
    echo "Healthcheck file present"
    if [ "$LAST_HEALTHCHECK" -gt "$FIVE_MINUTES_AGO" ]; then
        echo "Healthcheck too recent"
        exit 0
    fi
fi

# Run the command as passed, preserving argument boundaries
if "$@"; then
    echo "\"$COMMAND\" succeeded: updating file"
    touch "$HEALTHCHECK_FILE"
    exit 0
else
    echo "\"$COMMAND\" failed: exiting"
    exit 1
fi
I use it like this:
test: /healthcheck_retry.sh curl -fsS localhost:4000/healthcheck
The pain is that I need to make sure the script is available in every container, so I have to create an extra volume for this:
image: postgres:11.6-alpine
volumes:
  - ./scripts/utils/healthcheck_retry.sh:/healthcheck_retry.sh
Upvotes: 5
Reputation: 3463
Unfortunately, this is not possible out of the box.
All the configured durations are fixed. They can't change depending on the container state.
However, according to the documentation, the probe does not seem to wait for the start_period to finish before running your test. Its only effect is that any failure happening during start_period is not counted as an error.
Here is the passage that makes me think so:
start_period provides initialization time for containers that need time to bootstrap. Probe failure during that period will not be counted towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures will be counted towards the maximum number of retries.
I encourage you to test whether this is really the case, as I've never paid attention to whether the healthcheck runs during the start period.
If it does, you can probably increase your start_period if you're unsure about the boot duration, and also increase the interval, to find a good compromise.
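To check whether the healthcheck really runs during start_period, one option is to poll the status that Docker reports for the container and watch when it moves from "starting" to "healthy". The sketch below is only an illustration: wait_healthy is a hypothetical helper name, and its max_tries and delay parameters are assumptions, not part of Docker or docker-compose. It relies on docker inspect with the .State.Health.Status field, which is only populated for containers that define a healthcheck.

```shell
#!/bin/sh
# Hypothetical helper: poll a container's health status until it is "healthy".
# Usage: wait_healthy <container> [max_tries] [delay_seconds]
wait_healthy() {
    container=$1
    max_tries=${2:-60}   # assumed default: give up after 60 polls
    delay=${3:-1}        # assumed default: 1 second between polls
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        # Empty if the container is missing or has no healthcheck
        status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
        echo "$(date +%T) $container: ${status:-unknown}"
        [ "$status" = "healthy" ] && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, after sourcing the file, wait_healthy my_app 120 1 (my_app being a placeholder container name) prints one timestamped status line per second. If the status flips to "healthy" before the 30 seconds of start_period have elapsed, the probe is indeed running during that period.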
Upvotes: 8