pavan kalyan

Reputation: 107

Presto worker node memory consumption gradually increases over time and eventually kills the service

Presto version: 0.278.1


Hi, we are using Presto with the below configs:

One coordinator & 3 workers.

Coordinator configs:

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=49152
node-scheduler.max-splits-per-node=2500
query.max-run-time=5m
#task.max-worker-threads=24
query.max-memory=12GB
query.max-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://localhost:49152
join-distribution-type=BROADCAST
optimizer.join-reordering-strategy=NONE

Worker configs:

coordinator=false
http-server.http.port=49001
node-scheduler.max-splits-per-node=2500
query.max-run-time=5m
#task.max-worker-threads=64
query.max-memory=12GB
query.max-memory-per-node=2GB
#discovery-server.enabled=true
discovery.uri=http://:49152

It looks like a cache builds up as soon as a query is executed and we get the result. Is there a way to clear this cache and get rid of the memory build-up?

We are trying the following:

Automatic Cache Reset with Presto Configuration Settings

While Presto doesn't support direct cache invalidation, you can configure some parameters to manage memory and reduce the impact of caching.

Memory Spill and Query Timeouts

If you want Presto to handle memory more dynamically and avoid excessive memory usage, enable spilling and set query timeouts.

Enable spilling (if not already enabled):

experimental.spill-enabled=true

query.max-run-time=5m

But this does not look like a very sophisticated approach.

We are currently relaunching the service whenever memory usage is high. Could you suggest a better approach?

#presto #meta

Upvotes: 0

Views: 49

Answers (1)

JY One

Reputation: 70

It's normal for memory to increase and stay in use, as data fills up the caches over time, but it should not reach an OOM. Presto has many internal caches, and it's possible that some of them are being filled during execution, which increases memory utilization. They should be flushed when memory is needed. Additionally, it could simply be that JVM garbage collection is not kicking in to reduce the heap footprint.

Take a look at the memory management properties in the configuration reference: https://prestodb.github.io/docs/current/admin/properties.html#memory-management-properties as well as the "Memory Limits" section from this blog post: https://prestodb.io/blog/2019/08/19/memory-tracking/

You may want to experiment with the properties query.max-total-memory-per-node and/or memory.heap-headroom-per-node.
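As a rough sketch of what that could look like in a worker's config.properties: the values below are illustrative assumptions, not recommendations, and they assume a hypothetical JVM heap of 16GB (-Xmx16G in jvm.config), which the question does not state. The one constraint to keep in mind is that query.max-total-memory-per-node plus memory.heap-headroom-per-node must fit within the JVM heap, otherwise the node will refuse to start.

```properties
# Assumed for illustration only: JVM heap of 16GB (-Xmx16G in jvm.config).

# User memory a single query may use on this node (value from the question).
query.max-memory-per-node=2GB

# User + system memory a single query may use on this node.
# Must be at least query.max-memory-per-node. 4GB is an assumed starting point.
query.max-total-memory-per-node=4GB

# Heap reserved for allocations Presto does not track (for example,
# third-party libraries and internal caches). 4GB is an assumed starting point.
memory.heap-headroom-per-node=4GB
```

With these assumed numbers, 4GB + 4GB stays well below the 16GB heap, leaving the remainder for the general memory pool. If the workers still run out of memory, raising the headroom (or lowering the per-query limits) is one direction to try; the right values depend on the actual heap size and workload.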

Upvotes: 0
