Attila Szeremi

Reputation: 5478

Docker containers gone after a while in CoreOS

I have a couple of small projects running on CoreOS beta (899.5.0)

This has now happened to me for the second time: I have 3 containers running, all fine and dandy. Then, after a few days of not looking at the websites on that server, I notice the pages are down when I try to visit them.

And when I log into my CoreOS machine on Digital Ocean and type docker ps, I notice all my containers are gone! This is insane.

I even have a systemd service set up for a couple of them so that if they terminate for whatever reason, they get started again. But they don't.

I do see this greeting me when I log in, though; I'm not sure if it has something to do with it:

Last login: Sun Jan 17 23:42:37 2016 from 81.106.109.70
CoreOS beta (899.5.0)
Failed Units: 13
  [email protected]:22-219.219.114.120:14536.service
  [email protected]:22-219.219.114.120:30158.service
  [email protected]:22-219.219.114.120:17539.service
  [email protected]:22-122.224.34.168:1397.service
  [email protected]:22-122.224.34.168:3789.service
  [email protected]:22-122.224.34.168:2983.service
  [email protected]:22-219.219.114.120:51826.service
  [email protected]:22-219.219.114.120:38882.service
  [email protected]:22-219.219.114.120:34654.service
  [email protected]:22-219.219.114.120:21256.service
  [email protected]:22-219.219.114.120:39645.service
  [email protected]:22-219.219.114.120:63277.service
  [email protected]:22-219.219.114.120:37294.service

I couldn't find any information whatsoever on Google about this happening on CoreOS. Please, any help is appreciated!

P.S. my systemd config looks like this:

szeremi.service

[Unit]
Description=Run %p
Requires=docker.service
After=docker.service

[Service]
Restart=always
ExecStartPre=-/usr/bin/docker kill %p
ExecStartPre=-/usr/bin/docker rm -f %p
ExecStart=/usr/bin/docker run -t --rm --name %p \
  -p 80:8080 \
  amcsi/szeremi
ExecStop=/usr/bin/docker stop %p

[Install]
WantedBy=multi-user.target

EDIT: The latest page of the logfile (journalctl -u szeremi.service) is: https://gist.github.com/amcsi/95c8b0eb71de2f44c16b#file-journalctl-u-szeremi-service

Upvotes: 1

Views: 189

Answers (2)

Ben Campbell

Reputation: 4678

CoreOS uses systemd to define processes on the host. The service file you have shown is a systemd unit which defines how a Docker container will be run. If you are using systemd directly by logging into the CoreOS node, you must enable the service in order for it to persist across a reboot. The enable subcommand of systemctl takes a unit as a parameter and creates a symlink within the systemd target file structure. When systemd boots, it starts every service within a given target, which ensures the service starts at boot.

systemctl enable szeremi.service   # Creates a symlink within systemd 
systemctl start szeremi.service    # Starts the service => runs container
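
If you want to double-check the result, something like this should confirm it (the symlink path below assumes the WantedBy=multi-user.target line from the unit in the question):

systemctl is-enabled szeremi.service              # should print "enabled"
ls /etc/systemd/system/multi-user.target.wants/   # "enable" drops the symlink here for multi-user.target units
systemctl status szeremi.service                  # confirms the unit is active and the container is running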

Upvotes: 1

Josh Wood

Reputation: 21

The failed sshd units reported at login are the result of failed login attempts against the systemd socket-activated sshd on CoreOS. They're not related to the missing docker containers.
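
If you want to inspect or clean up those entries, standard systemctl commands should do it; this is just a sketch, nothing CoreOS-specific:

systemctl list-units --state=failed   # show the failed sshd@... instances and their state
sudo systemctl reset-failed           # clear the failed state so the login banner stops listing them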

It seems likely that the containers are "disappearing" after the system reboots to perform an automatic update. This is the default setting for updates, and those updates happen with some frequency on the beta channel. You can check how recently the system was rebooted with uptime or some similar command.
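
For example, something along these lines should show whether an update reboot happened; update-engine is the CoreOS updater unit, and the exact log output will vary:

uptime                                         # time since the last boot
journalctl --list-boots                        # boots recorded in the journal
journalctl -u update-engine --no-pager -n 50   # recent activity from the update service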

If the system has been rebooted and the containers have not been restarted afterward, you can focus troubleshooting on the systemd unit file and service startup. If it hasn't rebooted, the container logs (e.g., docker logs szeremi) and the service's logs (e.g., systemctl status -l szeremi.service) are possible places to begin. #coreos on freenode or the coreos-users mailing list will both have folks ready to help in either case.
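
Putting that together, a first pass might look like the following; the unit and container names assume the szeremi.service unit from the question:

systemctl status -l szeremi.service               # is the unit running, and how did it last exit?
journalctl -u szeremi.service --no-pager -n 100   # recent unit output (same source as the gist above)
docker ps -a                                      # is the container there but stopped?
docker logs szeremi                               # the container's own output, if it still exists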

Upvotes: 2
