seawolf

Reputation: 2195

Celery and RabbitMQ eventually stopping due to memory exhaustion

I have a Celery-based task queue with RabbitMQ as the broker. I am processing about 100 messages per day. I have no result backend configured.

I start the task master like this:

import os
from celery import Celery

broker = os.environ.get('AMQP_HOST', None)
app = Celery(broker=broker)
server = QueueServer((default_http_host, default_http_port), app)

... and I start the worker like this:

import os
from celery import Celery

broker = os.environ.get('AMQP_HOST', None)
app = Celery('worker', broker=broker)
app.conf.update(
    CELERYD_CONCURRENCY = 1,
    CELERYD_PREFETCH_MULTIPLIER = 1,
    CELERY_ACKS_LATE = True,
)

The server runs correctly for quite some time, but after about two weeks it suddenly stops. I have tracked the stoppage down to RabbitMQ no longer receiving messages due to memory exhaustion:

Feb 25 02:01:39 render-mq-1 docker/e654ac167b10[2189]: vm_memory_high_watermark set. Memory used:252239992 allowed:249239961
Feb 25 02:01:39 render-mq-1 docker/e654ac167b10[2189]: =WARNING REPORT==== 25-Feb-2016::02:01:39 ===
Feb 25 02:01:39 render-mq-1 docker/e654ac167b10[2189]: memory resource limit alarm set on node rabbit@e654ac167b10.
Feb 25 02:01:39 render-mq-1 docker/e654ac167b10[2189]: **********************************************************
Feb 25 02:01:39 render-mq-1 docker/e654ac167b10[2189]: *** Publishers will be blocked until this alarm clears ***
Feb 25 02:01:39 render-mq-1 docker/e654ac167b10[2189]: **********************************************************

The problem is that I cannot figure out what needs to be configured differently to prevent this exhaustion. Clearly something somewhere is not being purged, but I cannot tell what.

For instance, after about 8 days, rabbitmqctl status shows me this:

{memory,[{total,138588744},
      {connection_readers,1081984},
      {connection_writers,353792},
      {connection_channels,1103992},
      {connection_other,2249320},
      {queue_procs,428528},
      {queue_slave_procs,0},
      {plugins,0},
      {other_proc,13555000},
      {mnesia,74832},
      {mgmt_db,0},
      {msg_index,43243768},
      {other_ets,7874864},
      {binary,42401472},
      {code,16699615},
      {atom,654217},
      {other_system,8867360}]},

... whereas when it was first started, usage was much lower:

{memory,[{total,51076896},
      {connection_readers,205816},
      {connection_writers,86624},
      {connection_channels,314512},
      {connection_other,371808},
      {queue_procs,318032},
      {queue_slave_procs,0},
      {plugins,0},
      {other_proc,14315600},
      {mnesia,74832},
      {mgmt_db,0},
      {msg_index,2115976},
      {other_ets,1057008},
      {binary,6284328},
      {code,16699615},
      {atom,654217},
      {other_system,8578528}]},

... even when all the queues are empty (except one job currently processing):

root@dba9f095a160:/# rabbitmqctl list_queues -q name memory messages messages_ready messages_unacknowledged
celery  61152   1   0   1
[email protected]    117632  0   0   0
[email protected]    70448   0   0   0
celeryev.17c02213-ecb2-4419-8e5a-f5ff682ea4b4   76240   0   0   0
celeryev.5f59e936-44d7-4098-aa72-45555f846f83   27088   0   0   0
celeryev.d63dbc9e-c769-4a75-a533-a06bc4fe08d7   50184   0   0   0

I am at a loss to figure out how to find the reason for memory consumption. Any help would be greatly appreciated.

Upvotes: 0

Views: 2568

Answers (2)

Daniil Fedotov

Reputation: 381

The logs say you are using 252239992 bytes, which is about 250 MB; that is not especially high. How much memory does this machine have, and what is the vm_memory_high_watermark value for RabbitMQ? (You can check it by running rabbitmqctl eval "vm_memory_monitor:get_vm_memory_high_watermark().") Maybe you should just increase the watermark.
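For example, here is a sketch of raising the watermark in the classic rabbitmq.config file (Erlang-term syntax; the 0.6 value is purely illustrative, the default is 0.4):

```
%% /etc/rabbitmq/rabbitmq.config -- classic Erlang-term config format.
[
  {rabbit, [
    %% Fraction of installed RAM RabbitMQ may use before it raises the
    %% memory alarm and blocks publishers. Default is 0.4; 0.6 is an
    %% illustrative value -- pick one appropriate for your machine.
    {vm_memory_high_watermark, 0.6}
  ]}
].
```

Restart RabbitMQ (or use rabbitmqctl set_vm_memory_high_watermark) for the change to take effect.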

Another option is to make all your queues lazy: https://www.rabbitmq.com/lazy-queues.html
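For the queues shown in the question, that can be done with a policy; a sketch (the policy name lazy-celery and the ^celery pattern are assumptions, adjust to match your queue names):

```
# Mark every queue whose name starts with "celery" (including the
# celeryev.* event queues) as lazy, so messages are paged to disk
# instead of being kept in RAM.
rabbitmqctl set_policy lazy-celery "^celery" '{"queue-mode":"lazy"}' --apply-to queues
```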

Upvotes: 1

scytale

Reputation: 12641

You don't seem to be generating a huge volume of messages, so the ~250 MB of memory consumption seems strangely high. Nonetheless, you could try getting RabbitMQ to stop persisting messages - in your Celery configuration set

CELERY_DEFAULT_DELIVERY_MODE = 'transient'

Upvotes: 0
