Reputation: 41
I wonder how Kafka records are numbered or identified. What if I have a continuous stream of data flooding in? Won't it overflow at some point?
Upvotes: 0
Views: 387
Reputation: 395
Apache Kafka stores records continuously, limited only by the available disk space on your broker. These records are immutable. Each topic has a user-specified number of partitions, and each partition is a collection of segments.
Partitions - Collection of Segments
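What are segments?
A segment is a log file that covers a contiguous range of offsets, from a starting (base) offset to an ending offset. Kafka also keeps index files alongside it so records can be looked up by offset and by timestamp.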
Each partition's log is split into segments, and you can configure how large or how old a segment may get (for example with `segment.bytes` and `segment.ms`). Once the current segment reaches that limit, Kafka rolls over to a new segment. The segment you are currently producing records to is known as the active segment.
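To make the rollover concrete, here is a toy model in Python. It is only a sketch, not Kafka's implementation: the `Segment` and `Partition` classes and the 10-byte limit are made up for illustration, and it assumes a segment rolls when a byte limit would be exceeded.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int  # offset of the first record stored in this segment
    records: list = field(default_factory=list)
    size_bytes: int = 0


class Partition:
    """Toy append-only log that rolls to a new segment when the active one is full."""

    def __init__(self, max_segment_bytes: int):
        self.max_segment_bytes = max_segment_bytes  # hypothetical stand-in for a size limit
        self.next_offset = 0
        self.segments = [Segment(base_offset=0)]

    @property
    def active_segment(self) -> Segment:
        # Only the newest segment ever receives writes.
        return self.segments[-1]

    def append(self, value: bytes) -> int:
        seg = self.active_segment
        # Roll over if this record would push a non-empty segment past the limit.
        if seg.records and seg.size_bytes + len(value) > self.max_segment_bytes:
            seg = Segment(base_offset=self.next_offset)
            self.segments.append(seg)
        seg.records.append(value)
        seg.size_bytes += len(value)
        offset = self.next_offset
        self.next_offset += 1  # offsets keep increasing across segments
        return offset


partition = Partition(max_segment_bytes=10)
offsets = [partition.append(b"abcd") for _ in range(5)]
base_offsets = [s.base_offset for s in partition.segments]
print(offsets, base_offsets)
```

Note that the offset counter belongs to the partition, not to the segment, so rolling to a new segment never resets the numbering.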
Upvotes: 1
Reputation: 32110
It doesn't overflow. It keeps appending messages to the end of the log. For a single partition you're just limited by the space available on the broker's disk.
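As for the numbering itself: each record in a partition is identified by its offset, which is a signed 64-bit integer. A quick back-of-the-envelope calculation shows why running out of offsets is not a practical concern. The rate of one million records per second is just an assumption for illustration.

```python
MAX_OFFSET = 2**63 - 1  # largest value of a signed 64-bit integer
ASSUMED_RECORDS_PER_SECOND = 1_000_000  # hypothetical sustained rate into one partition
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Time needed to use up every possible offset at the assumed rate.
years_until_exhausted = MAX_OFFSET / ASSUMED_RECORDS_PER_SECOND / SECONDS_PER_YEAR
print(f"{years_until_exhausted:,.0f} years")
```

Even at that rate it would take hundreds of thousands of years to exhaust the offsets of a single partition, so disk space is the limit you will hit first.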
A topic can be configured with retention properties to retain data for a certain amount of time, up to a certain size, or indefinitely.
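Roughly, retention works by dropping whole old segments rather than individual records. The following Python sketch simulates that idea under simplifying assumptions: `SegmentInfo` and `apply_retention` are hypothetical names, `-1` is used to mean "no limit", and the newest (active) segment is never dropped. It is a simplified model, not the broker's actual cleanup code.

```python
from typing import NamedTuple


class SegmentInfo(NamedTuple):
    base_offset: int
    size_bytes: int
    last_timestamp_ms: int  # timestamp of the newest record in the segment


def apply_retention(segments, now_ms, retention_ms=-1, retention_bytes=-1):
    """Return the segments that survive; oldest segments are dropped first."""
    kept = list(segments)
    # Time-based: drop old segments whose newest record is past the retention window.
    if retention_ms >= 0:
        while len(kept) > 1 and now_ms - kept[0].last_timestamp_ms > retention_ms:
            kept.pop(0)
    # Size-based: drop old segments while the rest still meets the size limit.
    if retention_bytes >= 0:
        while len(kept) > 1 and (
            sum(s.size_bytes for s in kept) - kept[0].size_bytes >= retention_bytes
        ):
            kept.pop(0)
    return kept


segments = [
    SegmentInfo(base_offset=0, size_bytes=100, last_timestamp_ms=1_000),
    SegmentInfo(base_offset=50, size_bytes=100, last_timestamp_ms=5_000),
    SegmentInfo(base_offset=100, size_bytes=40, last_timestamp_ms=9_000),
]

by_time = apply_retention(segments, now_ms=10_000, retention_ms=6_000)
by_size = apply_retention(segments, now_ms=10_000, retention_bytes=120)
keep_all = apply_retention(segments, now_ms=10_000)
print([s.base_offset for s in by_time])
print([s.base_offset for s in by_size])
print([s.base_offset for s in keep_all])
```

With both limits disabled, nothing is ever deleted, which corresponds to the "indefinitely" case.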
See https://kafka.apache.org/documentation/#intro_topics
Upvotes: 1