Reputation: 64875
Kafka's default durability model is to write "in memory" to multiple brokers, at which point the produce request is acknowledged. Writes (and fsyncs) to durable storage come later and happen asynchronously.
It is sometimes recommended to set log.flush.interval.messages=1 (or, equivalently, the topic-level flush.messages=1 property) in order to get more immediate durability to disk.
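For reference, the settings in question would look roughly like this. This is only a sketch: it assumes the broker-wide value lives in server.properties, and it uses flush.messages, which is the name Kafka's topic-level configuration gives to the per-topic counterpart.

```properties
# Broker-wide default (server.properties): flush the log after every message.
log.flush.interval.messages=1

# Per-topic override of the same behavior (topic-level config).
flush.messages=1
```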
My question is: does this setting guarantee that the records are written durably to disk before the produce request is acknowledged? Or does it simply mean that every time some records are written, the write process is "kicked off" asynchronously, with the produce still acknowledged before the write completes? In the latter case there is a window in which an acked produce request may be lost if multiple hosts are lost simultaneously, even if their disks are intact.
Upvotes: 1
Views: 59
Reputation: 7940
No. Ultimately, the OS decides when the write cache is flushed to disk.
flush.messages=1 tells the OS to add the data to the cache to be written.
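To make the distinction being asked about concrete, here is a minimal Python sketch of the two OS-level steps involved. It is not Kafka code, and append_record is a hypothetical helper invented for illustration. A plain write() returns once the bytes are in the OS page cache, while fsync() blocks until the kernel reports that the data has reached the storage device.

```python
import os


def append_record(path, payload, sync):
    """Append payload to path; optionally force it to disk before returning.

    Hypothetical helper for illustration only, not part of Kafka.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # write() hands the bytes to the OS page cache and returns.
        # At this point the data can still be lost if the host loses power.
        os.write(fd, payload)
        if sync:
            # fsync() blocks until the kernel reports the data as flushed
            # to the underlying storage device.
            os.fsync(fd)
    finally:
        os.close(fd)
    # Return the file size so the caller can see the append took effect.
    return os.path.getsize(path)
```

Whether an acknowledgement is sent before or after the fsync step is exactly what decides if an acked record can be lost when several hosts fail at once.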
Upvotes: 0