Failed health check in Docker container on Cloud Foundry: no such file or directory

Question

I'm running a Docker container in Cloud Foundry.

After a few days the instance crashed with the following error:

Instance became unhealthy: exec failed: container_linux.go:348: starting container process caused "exec: \"/tmp/lifecycle/healthcheck\": stat /tmp/lifecycle/healthcheck: no such file or directory"

Facts:

Health check type is set to "port"
After the crash the app is restarted and runs fine
It has happened multiple times in different spaces
It also happened on a dev instance that was not processing any requests at the time

Questions:

What is this health check?
Why is this check executed?
How to prevent it?

Daniel Mikusa · Accepted Answer

What is this health check?

The Cloud Foundry platform monitors your application. When it detects that the application has "crashed" it will restart it for you. I put "crashed" in quotes because that's a nebulous term.

The platform defines "crashed" as an app that no longer responds to the health checks sent by the platform. There are three health checks.

The first is a PID-based health check, where the platform monitors the process to make sure it continues to run. If the process exits, the platform interprets this as a crash and restarts your app.
The second is a port-based health check. With this one, the platform makes sure that your application is listening on the port it has been assigned. As long as the platform can make a TCP connection to that port, your app is deemed to be healthy.
The third is an HTTP-based health check. This one actually sends an HTTP request to an endpoint of your application. The endpoint has to respond with a successful HTTP status code, otherwise your app is deemed to have crashed.

Every app deployed to CF uses the first health check. Any application with a route bound to it will use either the second or third health check, in addition to the first.

Your application appears to be using the port-based health check, which is #2.
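
To make checks #2 and #3 concrete, here is a minimal Python sketch of what each one amounts to. This is not the platform's actual implementation (the real check is the /tmp/lifecycle/healthcheck binary from your error message). The function names, the one-second timeout, and treating any 2xx status as "successful" are assumptions made for illustration.

```python
import http.client
import socket
import urllib.error
import urllib.request


def port_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Check #2: healthy if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, unreachable, and so on.
        return False


def http_check(url: str, timeout: float = 1.0) -> bool:
    """Check #3: healthy if a GET to url returns a successful status.

    Assumption: "successful" is taken to mean any 2xx status here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # urlopen raises HTTPError (a URLError) for 4xx/5xx responses.
        return False
```

Note that the port check only proves that something accepts connections on the port. It says nothing about whether the app answers requests correctly, which is what the HTTP check adds.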

Why is this check executed?

This check is done so the platform knows if your app is running properly. If it's not, the platform will attempt to take corrective action by restarting the failing application instance.

If neither the second nor the third health check is run, the platform can only tell whether the app is running based on the status of its PID. This leaves a lot of room for error, where the process can be up but hung or in some other way unable to actually do its work. These additional health checks allow the platform to detect more failure scenarios and automatically correct them.
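
The gap between the two kinds of check can be shown with a small Python sketch. The child process below is a stand-in for a hung app: it stays alive but never listens on any port. All names here (process_alive, port_open, free_port, demo) are hypothetical helpers for this illustration, not part of Cloud Foundry.

```python
import socket
import subprocess
import sys


def process_alive(proc: subprocess.Popen) -> bool:
    """PID-style check: healthy as long as the process has not exited."""
    return proc.poll() is None


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Port-style check: healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def free_port() -> int:
    """Ask the OS for a currently unused port, then release it."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def demo() -> tuple:
    port = free_port()
    # Stand-in for a hung app: the process is up but never binds the port.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        return process_alive(child), port_open("127.0.0.1", port)
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    alive, listening = demo()
    print(f"process check passed: {alive}, port check passed: {listening}")
```

With only the process check, the platform would consider this instance healthy. The port check is what flags it for a restart.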

How to prevent it?

You don't really want to prevent the health check. You can turn it off, but as mentioned previously that could leave your app in a non-functioning state.

If you really want to turn it off, you'd set the health check type to "process". This tells the platform to perform only the PID check (i.e. #1 above).

Ex: cf push --health-check-type process

In this case, I'd suggest reaching out to your Cloud Foundry operator to see what is happening. The health check is failing for a reason which appears to be unrelated to your application. They should be able to look at the platform logs to better understand the failures.

Hope that helps!
