Paul Draper

Reputation: 83323

Why isn't this script killing the Docker background process?

I've read "How do I kill background processes / jobs when my shell script exits?", but I can't get it to work.

IDK if it's Docker shenanigans or something else.

#!/bin/bash -e
base="$(dirname "$0")"

trap 'kill $(jobs -p)' SIGINT SIGTERM EXIT

docker run --rm -p 5432:5432 -e POSTGRES_PASSWORD=password postgres:12 &

while ! nc -z localhost 5432; do
  sleep 0.1
done

# uh-oh, error
false

When I run this, I am left with a running Docker container.

Why? How can I stop the process when my script exits?

Upvotes: 1

Views: 1536

Answers (2)

BMitch

Reputation: 264306

Docker is a client/server application consisting of a thin client, docker, and a server, dockerd. When you run a container, the client makes a few API calls to the server: one to create the container, another to start it, and, since you didn't run it detached, one to attach to it. When you kill the docker process, you kill only that client portion: it detaches from the container and stops showing you the logs. The dockerd server keeps running the container until the process inside it, running as PID 1 in the container's namespace, exits. You never killed that process, since it's spawned by the dockerd daemon, not directly by the docker client.

To fix this, I suggest running docker stop, with the container name or ID, as part of your trap handler. I wouldn't bother running docker in the background either; pass -d to run the container detached instead.
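A minimal sketch of that suggestion, using a container name rather than an ID. The name pg-test and the helper functions cleanup and start_db are hypothetical, picked only for this example; the docker flags are the ones from the question plus -d and --name.

```shell
#!/bin/bash
# Sketch only: "pg-test", cleanup and start_db are hypothetical names,
# not part of the original script.
name="pg-test"

# Ask dockerd to stop the container; with --rm it is removed afterwards.
# Errors are ignored so the handler is harmless if the container never started.
cleanup() {
  docker stop "$name" >/dev/null 2>&1 || true
}
trap cleanup EXIT

# Start the container detached (-d) instead of backgrounding the client with &.
start_db() {
  docker run --rm -d --name "$name" -p 5432:5432 -e POSTGRES_PASSWORD=password postgres:12
}

# In a real script: call start_db here, then run the steps that need the database.
```

Because the handler talks to dockerd by container name, it does not depend on the docker client process still being alive when the script exits.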


Follow-up: after testing the script locally, it looks like killing the docker client does send a stop signal to the container when you run the client attached like that. However, there's a race condition that can cause that stop to happen before the database is running. The command:

nc -z localhost 5432

is going to succeed as soon as the container is started, even before PostgreSQL starts listening on the port, because docker creates a port forward. E.g.:

$ nc -z localhost 5432 && echo it works

$ docker run -itd --rm -p 5432:5432 busybox tail -f /dev/null
c72427053124608fe18c31e5d6f3307d74a5cdce018503e9fff85dbc039b4fff

$ nc -z localhost 5432 && echo it works
it works

$ docker stop c72
c72

$ nc -z localhost 5432 && echo it works

However, if I add a sleep to the script that forces it to wait long enough for the container to finish starting up and for the attach to finish, the container is stopped.

A better version of the script looks like the following. It waits for the database to completely start by checking the logs, and changes the trap to run a docker stop command:

#!/bin/bash -e
base="$(dirname "$0")"

trap 'kill $(jobs -p)' SIGINT SIGTERM EXIT

cid=$(docker run --rm -d -p 5432:5432 -e POSTGRES_PASSWORD=password postgres:12)

# leaving the kill assuming you have other background processes
trap 'docker stop "$cid"; kill $(jobs -p)' SIGINT SIGTERM EXIT

# waiting for the db to actually start, assuming later steps need the db to be up
while ! docker logs "$cid" 2>&1 | grep -q "database system is ready to accept connections" ; do
  sleep 0.1
done

# uh-oh, error
false
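One possible refinement, offered as a sketch rather than as part of the answer's script: the wait loop above has no upper bound, so it would keep polling if the database never reported ready. The helper wait_until below is hypothetical (not a bash builtin or Docker command) and retries any command until it succeeds or a timeout passes. The 30-second figure in the usage comment is an arbitrary assumption.

```shell
#!/bin/bash
# wait_until is a hypothetical helper, not part of the original script.
# Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Retries COMMAND every 0.1s. Returns 0 once COMMAND succeeds,
# or 1 if TIMEOUT_SECONDS (whole seconds) pass first.
wait_until() {
  local timeout="$1"
  shift
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 0.1
  done
  return 0
}

# Possible use with the readiness check from the script above
# (cid as defined there; 30 is an arbitrary timeout):
#   db_ready() { docker logs "$cid" 2>&1 | grep -q "database system is ready to accept connections"; }
#   wait_until 30 db_ready
```

Under bash -e, a timed-out wait_until makes the script exit, which in turn fires the EXIT trap and stops the container.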

Upvotes: 3

Paul Draper

Reputation: 83323

It was Docker shenanigans.

I needed to use the --init option to run the tini shim as PID 1, because:

A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. As a result, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.

docker run --init --rm -p 5432:5432 -e POSTGRES_PASSWORD=password postgres:12 &

Upvotes: 1
