himanshuIIITian
himanshuIIITian

Reputation: 6095

Where is the default checkpoint(s) kept in Apache Flink?

I am a newbie to Apache Flink, and I was going through the Apache Flink's examples. I found that in case of a failure Flink has the ability to restore stream processing from a checkpoint.

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10000L);

Now, my question is where does Flink keep the checkpoint(s) by default?

Any help is appreciated!

Upvotes: 2

Views: 2481

Answers (2)

Mudit bhaintwal
Mudit bhaintwal

Reputation: 555

Default state back-end is MemoryStateBackend. Means it stores the in flight data in Task manager's JVM and checkpoint it in heap of master(job manger). its good for local debugging but you will loose your checkpoints if job goes down.

Usually for production use FsStateBackend with path to external file systems (HDFS,S3 etc). It stores in flight data in Task manager's JVM and checkpoint it to external file system.

like

env.setStateBackend(new FsStateBackend("file:///apps/flink/checkpoint"));

Optionally a small meta file can also be configured pointing to the state store for High availability.

Upvotes: 2

Fabian Hueske
Fabian Hueske

Reputation: 18987

Flink features the abstraction of StateBackends. A StateBackend is responsible to locally manage the state on the worker node but also to checkpoint (and restore) the state to a remote location.

The default StateBackend is the MemoryStateBackend. It maintains the state on the workers' (TaskManagers') JVM heap and checkpoints it to the JVM heap of the master (JobManager). Hence the MemoryStateBackend does not require any additional configuration or external system and is good for local development. However, it is obviously not scalable and suited for any serious workload.

Flink also provides a FSStateBackend, which holds holds local state also on the workers' JVM heap and checkpoints it to a remote file system (HDFS, NFS, ...). Finally, there's also the RocksDBStateBackend, which stores state in an embedded disk-based key-value store (RocksDB) and also checkpoints to a remote file system (HDFS, NFS, ...).

Upvotes: 6

Related Questions