Rahul

Reputation: 1807

When does Kafka purge messages?

I have an Apache Kafka cluster with the retention policy set to delete and the retention period set to 24 hours. I then changed the retention period dynamically to 1 minute for one specific topic, but the old messages are still there, so I have several questions:

  1. What is the trigger point for retention? I assume that even though an explicit time to live is set for messages, there is no guarantee that they will be deleted exactly after that time. So what is the process? (I can't find anything in the reference.)
  2. If I change the retention period at runtime, will the old messages obey it? As far as I understand, the retention period is a topic-wide property and should also apply to messages that were published under the first retention period.

Upvotes: 4

Views: 1652

Answers (2)

fhussonnois

Reputation: 1727

On each broker, partitions are divided into log segments. By default, a segment stores up to 1 GB of data (log.segment.bytes). In addition, a new log segment is rolled every 7 days by default (log.roll.hours).

Each broker schedules a cleaner thread that periodically checks which segments are eligible for deletion. By default, the cleaner thread runs this check every 5 minutes (configurable through the broker setting log.retention.check.interval.ms).

A segment is removable if the most recent message within that segment is older than the configured retention period. In addition, the active log segment (the one the broker is currently writing to) can't be deleted.
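The rule above can be sketched as a small Python model. This is only an illustration of the eligibility check as described here, not Kafka's actual code; the Segment class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    base_offset: int           # hypothetical field: first offset stored in the segment
    newest_timestamp_ms: int   # timestamp of the most recent message in the segment
    active: bool = False       # True for the segment the broker is currently writing to


def removable_segments(segments: List[Segment], now_ms: int, retention_ms: int) -> List[Segment]:
    """Return the segments that the rule described above would allow to be deleted."""
    return [
        s for s in segments
        # The active segment is never removable; a closed one is removable
        # only when even its newest message is older than the retention period.
        if not s.active and now_ms - s.newest_timestamp_ms > retention_ms
    ]


if __name__ == "__main__":
    minute = 60_000
    log = [
        Segment(0, newest_timestamp_ms=2 * minute),
        Segment(100, newest_timestamp_ms=9 * minute + 30_000),
        Segment(200, newest_timestamp_ms=1 * minute, active=True),
    ]
    for s in removable_segments(log, now_ms=10 * minute, retention_ms=minute):
        print("removable segment with base offset", s.base_offset)
```

In this model the third segment is kept even though its newest message is well past the 1-minute retention, simply because it is still the active segment. That matches the behaviour in the question: after lowering the retention period, old messages stay visible until their segment is closed and the next retention check runs.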

To have segments removed as soon as possible, configure log rolling in line with your retention period. For example, if your retention period is 24 hours, it could be a good idea to set log.roll.hours to 1.
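To see why the roll interval matters, here is a rough back-of-the-envelope estimate in Python. It is a simplified sketch under several assumptions: the partition receives writes continuously, segments are rolled by time only (size-based rolling would only close them earlier), and any extra delay between marking a segment for deletion and removing its files is ignored. The function name is made up for illustration.

```python
HOUR_MS = 3_600_000
MINUTE_MS = 60_000


def worst_case_age_ms(retention_ms: int, roll_ms: int, check_interval_ms: int) -> int:
    """Rough upper bound on the age of a message when its segment gets deleted.

    A message written right after a segment is opened waits for the segment
    to be rolled (roll_ms), then for the newest message in that segment to
    exceed the retention period (retention_ms), then for the next retention
    check to run (check_interval_ms).
    """
    return roll_ms + retention_ms + check_interval_ms


if __name__ == "__main__":
    retention = 24 * HOUR_MS
    check = 5 * MINUTE_MS
    for label, roll in [("default 7-day roll", 7 * 24 * HOUR_MS), ("1-hour roll", HOUR_MS)]:
        age = worst_case_age_ms(retention, roll, check)
        print(f"{label}: up to {age / HOUR_MS:.2f} hours")
```

Under these assumptions, a 24-hour retention with the default 7-day roll can leave a message on disk for roughly 8 days, while a 1-hour roll brings the bound down to a little over 25 hours.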

Note that segment deletion can actually happen at a different time on each broker, as the cleaner threads are not scheduled together.

You can check a specific topic's configuration with the kafka-configs script:

Example: ./bin/kafka-configs --describe --zookeeper localhost:2181 --entity-type topics --entity-name __consumer_offsets

Upvotes: 4

Gokul Potluri

Reputation: 262

The retention policy is applied to closed segments only. If your segment is still active, the data in that segment won't be purged until it is closed and a new segment is opened.

Upvotes: 2
