Reputation: 1631
I want to run an Apache Flink (1.11.1) streaming application on Kubernetes, with a filesystem state backend saving to S3. Checkpointing to S3 is working. The job is submitted with the following arguments:
args:
- "standalone-job"
- "-s"
- "s3://BUCKET_NAME/34619f2862ce3e5fc91d80eae13a434a/chk-4/_metadata"
- "--job-classname"
- "com.abc.def.MY_JOB"
- "--kafka-broker"
- "KAFKA_HOST:9092"
The job keeps its state in a ListState<String>, but whenever I deploy a newer version of my application via GitLab, it again throws this event. Checkpointing is enabled in the job with

env.enableCheckpointing(Duration.ofSeconds(60).toMillis());
env.getCheckpointConfig().enableExternalizedCheckpoints(RETAIN_ON_CANCELLATION);
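For illustration, here is a minimal, self-contained sketch of a job wired up this way. The class name and the deduplication logic are my assumptions, not the asker's actual code, and the Kafka source reading from KAFKA_HOST:9092 is stubbed out with fromElements:

import java.time.Duration;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class MyJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Filesystem state backend writing checkpoints to S3 (path is a placeholder).
        env.setStateBackend(new FsStateBackend("s3://BUCKET_NAME/checkpoints"));

        // The two settings quoted in the question: checkpoint every 60 s and
        // retain the externalized checkpoint when the job is cancelled.
        env.enableCheckpointing(Duration.ofSeconds(60).toMillis());
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // Stand-in for the Kafka source; the real job reads from KAFKA_HOST:9092.
        env.fromElements("event-1", "event-2", "event-1")
           .keyBy(value -> value)
           .flatMap(new DedupFunction())
           .print();

        env.execute("MY_JOB");
    }

    /** Keyed function holding a ListState<String>, as mentioned in the question. */
    public static class DedupFunction extends RichFlatMapFunction<String, String> {
        private transient ListState<String> seen;

        @Override
        public void open(Configuration parameters) {
            seen = getRuntimeContext().getListState(
                    new ListStateDescriptor<>("seen", String.class));
        }

        @Override
        public void flatMap(String value, Collector<String> out) throws Exception {
            // Emit only the first occurrence per key. This state must survive a
            // redeploy, which is exactly what restoring via -s should provide.
            if (!seen.get().iterator().hasNext()) {
                seen.add(value);
                out.collect(value);
            }
        }
    }
}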
Upvotes: 3
Views: 1126
Reputation: 115
There are several ways to deploy workloads to Kubernetes: plain YAML files, a Helm chart, or an operator.
Upgrading a stateful Flink job is not as simple as upgrading a stateless service, where you only need to swap the binary and restart.
To upgrade a Flink job, you need to take a savepoint (or grab the latest checkpoint directory), then update the binary, and finally resubmit the job. Plain YAML files and Helm charts cannot orchestrate this sequence for you, so you should consider using a Flink operator to do the upgrade, for example:
https://github.com/GoogleCloudPlatform/flink-on-k8s-operator
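If you script the upgrade yourself instead, the savepoint step can be automated through Flink's REST API (POST /jobs/<jobId>/savepoints). A minimal sketch follows; the JobManager address, the savepoint target directory, and the job id (taken from the checkpoint path in the question) are assumptions you would replace:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TriggerSavepoint {
    public static void main(String[] args) throws Exception {
        // Assumed values: replace with your JobManager service, job id, and S3 path.
        String restUrl = "http://flink-jobmanager:8081";
        String jobId = "34619f2862ce3e5fc91d80eae13a434a";
        String body = "{\"target-directory\": \"s3://BUCKET_NAME/savepoints\", \"cancel-job\": false}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(restUrl + "/jobs/" + jobId + "/savepoints"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The response carries a "request-id"; poll
        // GET /jobs/<jobId>/savepoints/<request-id> until the savepoint completes,
        // then pass its path to the new deployment via the -s argument.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}

The operator linked above automates essentially this sequence (savepoint, update, resubmit) as part of a declarative upgrade.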
Upvotes: 0
Reputation: 43524
These settings can also go into flink-conf.yaml instead of the job code:

execution.checkpointing.interval: 60000
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION

They are the configuration-file equivalents of the enableCheckpointing and enableExternalizedCheckpoints calls quoted in the question.
Upvotes: 2