Reputation: 637
We are facing an issue: we use a persistent Kafka Streams state store, and it often runs out of disk space (8 GB), so we are thinking of moving to an in-memory state store, i.e. changing
Stores.persistentKeyValueStore("name");
To
Stores.inMemoryKeyValueStore("name");
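We just have a few questions if we change to in-memory:
Do we lose any data in case of a broker/consumer restart?
How does the consumer get the previous data if the old data is flushed from memory? Does it get that data from the broker?
Is there any other disadvantage of switching to in-memory?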
Please note that we have streaming applications (KTable) with around 2M unique messages.
Each message is around 2 KB, and the average frequency is 500 messages/sec.
Upvotes: 5
Views: 6103
Reputation: 62285
runs out of space (8gb) so we are thinking of moving to in memory state store
Switching to in-memory stores seems like a step backwards. 8 GB is also rather small; why do you have such small disks?
Do we lose any data in case of a broker/consumer restart?
No. Persistent stores are just an optimization to reduce startup times and to hold larger state (as they can spill to disk). Both persistent and in-memory stores are backed by a changelog topic in the Kafka cluster for fault tolerance. For proper fault tolerance, you need to apply the same configuration to Kafka Streams and to the changelog topic, regardless of the store type.
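As a configuration sketch of the point above, the following shows an in-memory store for a KTable that stays backed by a changelog topic. It needs the kafka-streams library on the classpath. The application id, bootstrap server, input topic name, serdes, replication factor of 3, and `min.insync.replicas` of 2 are all assumptions chosen for illustration, not values from the question.

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

public class InMemoryTableTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "example-app");       // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical
        // Replication factor for internal topics, which includes changelog topics.
        props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);               // assumed value

        // In-memory store; logging to the changelog topic is kept enabled,
        // with an assumed topic-level config passed through.
        Materialized<String, String, KeyValueStore<Bytes, byte[]>> store =
            Materialized.<String, String>as(Stores.inMemoryKeyValueStore("name"))
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.String())
                .withLoggingEnabled(Map.of("min.insync.replicas", "2"));

        StreamsBuilder builder = new StreamsBuilder();
        builder.table("input-topic",                                         // hypothetical topic
                      Consumed.with(Serdes.String(), Serdes.String()),
                      store);

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```

Swapping `Stores.inMemoryKeyValueStore("name")` for `Stores.persistentKeyValueStore("name")` leaves the rest of this sketch unchanged, which reflects that the changelog backing does not depend on the store type.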
How does the consumer get the previous data if the old data is flushed from memory? Does it get that data from the broker?
If you use an in-memory store, the client always holds a full copy of the data set. Hence, your data set must fit into main memory. The writes to the Kafka cluster are for fault tolerance only. During normal operation, Kafka Streams only writes to the changelog topics. Changelog topics are only read if a task is migrated and the store needs to be rebuilt.
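The write-always, read-only-on-rebuild pattern can be illustrated with a toy model in plain Java. This is not the Kafka Streams API: the list below is a hypothetical stand-in for a changelog topic, and the map is a stand-in for the in-memory store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChangelogReplaySketch {
    /** Write to the local store and append the same update to the changelog. */
    static void put(Map<String, String> store, List<String[]> changelog,
                    String key, String value) {
        store.put(key, value);
        changelog.add(new String[] {key, value});
    }

    /** Rebuild a fresh store by replaying the changelog from the beginning. */
    static Map<String, String> restore(List<String[]> changelog) {
        Map<String, String> store = new HashMap<>();
        for (String[] record : changelog) {
            store.put(record[0], record[1]); // later records overwrite earlier ones
        }
        return store;
    }

    public static void main(String[] args) {
        List<String[]> changelog = new ArrayList<>();
        Map<String, String> store = new HashMap<>();

        // Normal operation: the changelog is only written, never read.
        put(store, changelog, "a", "1");
        put(store, changelog, "b", "2");
        put(store, changelog, "a", "3");

        // Simulated restart or task migration: local memory is gone,
        // so the store is rebuilt from the changelog.
        store = null;
        Map<String, String> rebuilt = restore(changelog);
        System.out.println(rebuilt.get("a") + " " + rebuilt.get("b") + " " + rebuilt.size());
        // prints "3 2 2"
    }
}
```

The replay loop is where the extra startup time comes from: the more records the changelog holds, the longer the rebuild takes.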
Is there any other disadvantage of switching to in-memory?
As mentioned, the disadvantages are:
- You lose the local state on a rolling restart, so the state needs to be recovered from the changelog topic, which increases startup time.
- Your state must fit into main memory.
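To check the second point against the numbers in the question (about 2M unique messages of about 2 KB each, 500 messages/sec), a rough estimate of the raw payload can be computed as below. It assumes 2 KB means 2048 bytes and that one entry is kept per unique message; it ignores key size, JVM object overhead, and heap headroom, so the real memory requirement will be higher.

```java
import java.util.Locale;

public class StateSizeEstimate {
    /** Raw payload held in the store: one value per unique key. */
    static long rawStateBytes(long uniqueKeys, long bytesPerMessage) {
        return uniqueKeys * bytesPerMessage;
    }

    /** Raw payload written per second at the given message rate. */
    static long writeBytesPerSecond(long messagesPerSecond, long bytesPerMessage) {
        return messagesPerSecond * bytesPerMessage;
    }

    public static void main(String[] args) {
        long uniqueKeys = 2_000_000L;   // from the question
        long bytesPerMessage = 2_048L;  // "around 2kb", assumed to be 2048 bytes
        long messagesPerSecond = 500L;  // from the question

        double gib = rawStateBytes(uniqueKeys, bytesPerMessage) / (1024.0 * 1024 * 1024);
        double mibPerSec = writeBytesPerSecond(messagesPerSecond, bytesPerMessage) / (1024.0 * 1024);

        System.out.printf(Locale.ROOT, "raw state: %.2f GiB%n", gib);          // raw state: 3.81 GiB
        System.out.printf(Locale.ROOT, "write rate: %.2f MiB/s%n", mibPerSec); // write rate: 0.98 MiB/s
    }
}
```

Under these assumptions, each instance needs memory for its share of roughly 3.8 GiB of raw payload plus overhead, and after a restart that share has to be read back from the changelog topic.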
Upvotes: 5