Reputation: 121
I have a Flink session cluster on top of Kubernetes, and I recently switched from ZooKeeper-based HA to Kubernetes HA.
Reading through
https://cwiki.apache.org/confluence/display/FLINK/FLIP-144%3A+Native+Kubernetes+HA+for+Flink#FLIP144:NativeKubernetesHAforFlink-LeaderElection
I can observe in the Flink namespace the configmaps for each component, as described in the docs above:
k8s-ha-app1-00000000000000000000000000000000-jobmanager 2 4m35s
k8s-ha-app1-dispatcher 2 4m38s
k8s-ha-app1-resourcemanager 2 4m38s
k8s-ha-app1-restserver 2 4m38s
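For reference, the listing above comes from something along these lines (the namespace name "flink" is just a placeholder for my Flink namespace):
kubectl -n flink get configmaps | grep k8s-ha-app1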
However, I don't see a single configmap for the "jobmanager" resource. Instead I see as many as there are jobs run across the day. This can be a high number, so over several days it adds up to a huge pile of configmaps in the cluster namespace.
The different HA configmaps for the jobmanager seem to differ both in the
"address": "akka.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_XXX"
(where XXX keeps increasing)
and in the "sessionId" value.
Can someone please explain on what basis these "jobmanager" resources are created? At first I thought there might be a scheduled cleanup round, but the docs say the HA configmaps have no owner reference set and are therefore not deleted. Did I miss setting something so that all jobs run against the same session, or is there a way to get these k8s-ha-app1-XXXXXXXXXXXXXXXXXXXXX-jobmanager configmaps cleaned up after the job finishes?
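For completeness, the Kubernetes HA part of my flink-conf.yaml boils down to roughly the following (the storage directory is a placeholder, and the cluster id matches the ConfigMap prefix above):
kubernetes.cluster-id: k8s-ha-app1
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://<bucket>/flink-ha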
Upvotes: 0
Views: 648
Reputation: 13346
The way Flink works internally is that the Dispatcher creates a dedicated JobMaster component for every submitted job. This component needs its own leader election, and for that purpose it creates a k8s-ha-app1-<JOB_ID>-jobmanager ConfigMap. That is why you see multiple xyz-jobmanager ConfigMaps being created.
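These HA ConfigMaps carry labels, so you can list them with a label selector; the label names below are what I recall the native Kubernetes HA services using (double-check on your cluster), and the namespace and cluster id are taken from your example:
kubectl -n flink get configmaps -l app=k8s-ha-app1,configmap-type=high-availability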
The reason these ConfigMaps are not cleaned up is that clean-up currently happens only when the whole cluster is shut down. This is a known limitation, and the Flink community has created FLINK-20695 to fix it. The idea is that the JobMaster-related ConfigMaps can be deleted once the job has reached a terminal state.
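Until that fix is available, the only option I am aware of is manual clean-up: once a job has reached a terminal state and you no longer need its HA metadata, you can delete its ConfigMap yourself (the job id and namespace are placeholders; make sure the job is really finished and will not be recovered before doing this):
kubectl -n flink delete configmap k8s-ha-app1-<JOB_ID>-jobmanager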
A related limitation hampers proper clean-up in the case of a session cluster: if the cluster is shut down with a SIGTERM signal, it is currently not guaranteed that all resources are cleaned up. See FLINK-21008 for more information.
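As a consequence, after tearing down the whole session cluster you may still need to remove leftover HA ConfigMaps by hand, for example via the label selector mentioned above (again assuming those label names and namespace, and that you no longer need the HA metadata for recovery):
kubectl -n flink delete configmaps -l app=k8s-ha-app1,configmap-type=high-availability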
Upvotes: 2