Mohammad Hossein Gerami

Reputation: 1388

Shared Checkpoint Is Very Large

I am running my Flink app with a parallelism of 16. After 20 minutes, the shared checkpoint has grown to 235MB. How can I handle this? Over a long period it will become very large.


Upvotes: 0

Views: 526

Answers (1)

David Anderson

Reputation: 43697

Flink will use only as much space for state as is required to do what you've asked it to do. If you are unhappy with the result, you need to somehow ask it to do less.

Here are some things you might do:

  • Make sure your application isn't leaking state. This can happen, for example, if you are using keyed state with an unbounded key space, and aren't clearing the state.
  • Establish a state retention interval (for the Table/SQL API).
  • Use State TTL to free unneeded state.
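
To make the difference between leaking and expiring state concrete, here is a minimal plain-Java model of keyed state with a time-to-live. It is not Flink's API: in a real job, State TTL is configured with a `StateTtlConfig` that you enable on the state descriptor via `enableTimeToLive`. The class `TtlStateModel`, its methods, and the 1000 ms TTL are hypothetical names and values chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of per-key state with a time-to-live (NOT Flink's API).
class TtlStateModel {
    private final long ttlMillis;
    // key -> timestamp of the last write; stands in for one keyed state entry
    private final Map<String, Long> lastWrite = new HashMap<>();

    TtlStateModel(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Record a write for this key at the given time.
    void update(String key, long nowMillis) {
        lastWrite.put(key, nowMillis);
    }

    // Drop every entry whose last write is at least ttlMillis old.
    void expire(long nowMillis) {
        lastWrite.values().removeIf(ts -> nowMillis - ts >= ttlMillis);
    }

    int size() {
        return lastWrite.size();
    }

    public static void main(String[] args) {
        TtlStateModel state = new TtlStateModel(1_000);
        // Unbounded key space: a brand-new key arrives every millisecond.
        for (int i = 0; i < 10_000; i++) {
            state.update("key-" + i, i);
            state.expire(i);
        }
        // Without expire() the map would hold all 10,000 keys;
        // with it, only keys written within the last second remain.
        System.out.println(state.size());
    }
}
```

Without the `expire` call the map grows with every new key, which is the leak described in the first bullet. With it, the size stays bounded by the number of keys seen within one TTL window.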

There are certain anti-patterns that require a lot of buffering in state. You should avoid those. :)

You could restrict the resources available for storing state, but this will result in the job failing when those resources are exhausted.

Also, 235MB across 16 slots isn't very large for RocksDB. With incremental checkpointing, RocksDB is storing multiple (uncompacted) copies of the state. The actual active state you're using could be much less.
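
To put that number in perspective, here is a back-of-the-envelope calculation. It assumes the state is spread evenly across the slots, which a real job with skewed keys may not satisfy. `CheckpointSizeEstimate` and `perSlotMb` are hypothetical names for this sketch.

```java
// Rough per-slot share of the checkpoint, assuming an even distribution.
class CheckpointSizeEstimate {
    static double perSlotMb(double totalMb, int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        return totalMb / slots;
    }

    public static void main(String[] args) {
        // 235 MB of shared checkpoint data across 16 parallel slots
        System.out.println(perSlotMb(235, 16) + " MB per slot");
    }
}
```

That works out to a little under 15 MB per slot, and with incremental checkpointing part of that is uncompacted copies rather than live state.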

Upvotes: 1
