haggy
haggy

Reputation: 753

Rethinkdb SIGTERM, shutting down

I'm running RethinkDB in Docker. Everything was working REALLY well for a while until we moved to a new datacenter (but im not sure if this is related to the move at all). Here's what is going on.

I start the rethinkdb container and everything runs well for a while. After some amount of time (it varies between an hour or more) I see the following in the Docker logs (highlighted in yellow):

RethinkDB SIGTERM

I have absolutely no idea why it receives a SIGTERM from the system randomly. Any ideas would be greatly appreciated!

EDIT: I am adding the snippet of the logfile for the SIGTERMs. There doesn't seem to be any kind of pattern according to the timestamps.

2015-07-15T16:15:02.888762613 663165.661585s notice: Server got SIGTERM from pid 0, uid 0; shutting down...
2015-07-17T17:02:11.562306701 13322.914561s notice: Server got SIGTERM from pid 0, uid 0; shutting down...
2015-07-19T18:31:12.499022237 96786.220054s notice: Server got SIGTERM from pid 0, uid 0; shutting down...
2015-07-20T13:52:44.493304030 69690.608865s notice: Server got SIGTERM from pid 0, uid 0; shutting down...

EDIT 2: I ran RethinkDB outside of Docker and I see this in the logs: error: worker process failed to resynchronize with main process. Not sure if it's anything to be concerned about. It does not appear to affect the RethinkDB instance at all (all clients remained connected).

2015-07-21T06:53:10.663375859 0.116098s info: Automatically using cache size of 10702 MB
2015-07-21T06:53:10.676277261 0.128998s notice: Listening for intracluster connections on port 29015
2015-07-21T06:53:10.684504354 0.137225s notice: Listening for client driver connections on port 28015
2015-07-21T06:53:10.685485550 0.138206s notice: Listening for administrative HTTP connections on port 8080
2015-07-21T06:53:10.686313405 0.139034s notice: Listening on addresses: 127.0.0.1, 172.17.42.1, 192.151.151.122, ::1, fe80::1879:43ff:fe5e:bdb2%34, fe80::62eb:69ff:fe07:d986%2, fe80::b837:f2ff:fecd:d5cd%4
2015-07-21T06:53:10.686316632 0.139037s notice: Server ready, "0aa312e817ef_nrx" 069ac5b3-9f43-4bbe-9022-c1f006790e99
2015-07-21T06:53:11.558116243 1.010837s error: worker process failed to resynchronize with main process
2015-07-21T06:53:11.558122179 1.010843s notice: A newer version of the RethinkDB server is available: 2.0.4. You can read the changelog at <https://github.com/rethinkdb/rethinkdb/releases>.

EDIT 3: I found another issue here that I think may be the real problem. The rethink adapter (in the app) is keeping the connections to the DB server established which is exhausting the available file descriptors/ports in the system. Here is an example printout of an lsof. NOTE this is only a short list. There are hundreds/thousands of these kept open when multiple people use the system

node    11633 [username_ommitted]  201u    IPv4 0x53153575d33d64bb       0t0      TCP 192.168.1.142:61041->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  202u    IPv4 0x53153575d33fa65b       0t0      TCP 192.168.1.142:61053->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  203u    IPv4 0x53153575dd6a5d8b       0t0      TCP 192.168.1.142:61043->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  204u    IPv4 0x53153575bff6717b       0t0      TCP 192.168.1.142:61044->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  206u    IPv4 0x53153575d33e54bb       0t0      TCP 192.168.1.142:61049->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  207u    IPv4 0x53153575d33ef4bb       0t0      TCP 192.168.1.142:61050->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  208u    IPv4 0x53153575d33f2a4b       0t0      TCP 192.168.1.142:61051->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  209u    IPv4 0x53153575c333a17b       0t0      TCP 192.168.1.142:61054->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  210u    IPv4 0x53153575d33b47fb       0t0      TCP 192.168.1.142:61056->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  211u    IPv4 0x53153575d33de17b       0t0      TCP 192.168.1.142:61057->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  212u    IPv4 0x53153575d33f065b       0t0      TCP 192.168.1.142:61058->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  213u    IPv4 0x53153575bff67a4b       0t0      TCP 192.168.1.142:61059->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  216u    IPv4 0x53153575dd68f31b       0t0      TCP 192.168.1.142:61062->[RETHINK_IP]:28015 (ESTABLISHED)
node    11633 [username_ommitted]  217u    IPv4 0x53153575dd675a4b       0t0      TCP 192.168.1.142:61063->[RETHINK_IP]:28015 (ESTABLISHED)

Upvotes: 3

Views: 592

Answers (1)

mwakerman
mwakerman

Reputation: 138

I had the exact same issue and believe it to be caused by not closing database connections (as you note in your EDIT 3). I'm using RethinkDB in a expressjs app and followed there middleware example from here but I always terminate the request-response cycle within my controllers without calling next(), meaning the closeConnection middleware was never being reached.

Upvotes: 0

Related Questions