Reputation: 1
Requirement: The Flink cluster (including the JobManager and TaskManagers) needs to operate continuously 24/7 to ensure that Flink jobs can be submitted and run without interruption.
Issue: The JVM Metaspace of both the JobManager and the TaskManagers keeps growing with every Flink job execution. Once Metaspace usage reaches roughly 95%, the cluster becomes unresponsive and Flink jobs start to fail. I tested this by submitting a simple WordCount job thousands of times; a job like that should leave no classes or classloaders behind in Metaspace, yet usage still keeps increasing.
I would appreciate any suggestions for potential solutions to this problem.
The only workaround currently available is to frequently restart the Flink cluster, but this is not the solution I am seeking.
Upvotes: 0
Views: 182
Reputation: 76547
You might consider using the built-in Flame Graph, which can help you dig into where your job is spending the bulk of its time, in case something in your job graph is putting pressure on the CPU or on allocated memory. If you are using Flink 1.19+, you can also use the async-profiler integration in the Flink UI to profile your running job and identify these kinds of issues.
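As a rough illustration, both views are typically disabled by default and gated behind configuration flags. The snippet below is a minimal sketch of what enabling them might look like in flink-conf.yaml; the option names are my assumption of the standard settings, so verify them against the documentation for your Flink version.

```yaml
# Minimal sketch -- assumed option names, verify for your Flink version.

# Enable the Flame Graph view in the web UI (disabled by default).
rest.flamegraph.enabled: true

# Flink 1.19+: enable the async-profiler based profiling view in the web UI.
rest.profiling.enabled: true
```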
Some other considerations:
If your job relies on any expensive or stateful components (for example clients, connections, or caches), make sure they are initialized in the open() function. Initialization outside of this, such as in the processElement() function, could cause the component to be initialized on every element, which could also explain these types of issues (a short sketch of this pattern follows at the end of this answer).

Without knowing the specific details of your application, it's hard to say exactly what the problem could be.
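To make that concrete, here is a minimal sketch of the open() pattern using a ProcessFunction; HeavyClient is a hypothetical placeholder for whatever expensive dependency your job actually uses, not a real API.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class EnrichFunction extends ProcessFunction<String, String> {

    // Hypothetical stand-in for an expensive/stateful dependency (client, connection, cache, ...).
    private transient HeavyClient client;

    @Override
    public void open(Configuration parameters) {
        // Initialize once per parallel task instance, when the operator starts.
        client = new HeavyClient();
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) {
        // Do NOT create the component here: processElement() runs for every record,
        // so per-element initialization piles up resources (and can pin classes).
        out.collect(client.enrich(value));
    }

    @Override
    public void close() {
        // Release the dependency when the task shuts down.
        if (client != null) {
            client.close();
        }
    }

    // Placeholder type for illustration only.
    private static class HeavyClient {
        String enrich(String value) { return value; }
        void close() {}
    }
}
```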
Upvotes: 0