xRobot

Reputation: 26567

Is it safe to delete the cleaner-offset-checkpoint file to force the compaction?

I need a way to force compaction of the __consumer_offsets topic. In a test environment I deleted the cleaner-offset-checkpoint file, and Kafka then deleted many segments, as you can see below. Is it safe to delete this file in a production environment?

Before removing cleaner-offset-checkpoint:

# ls -la data/kafka/__consumer_offsets-*/*
-rw-r--r--. 1 root root        0 14 giu 17.20 data/kafka/__consumer_offsets-0/00000000000000000000.log
-rw-r--r--. 1 root root    35452 14 giu 17.45 data/kafka/__consumer_offsets-0/00000000000000018880.log
-rw-r--r--. 1 root root    38682 14 giu 17.54 data/kafka/__consumer_offsets-0/00000000000000021214.log
-rw-r--r--. 1 root root        0 14 giu 17.16 data/kafka/__consumer_offsets-10/00000000000000000000.log
-rw-r--r--. 1 root root        0 14 giu 17.20 data/kafka/__consumer_offsets-1/00000000000000000000.log
-rw-r--r--. 1 root root    35452 14 giu 17.37 data/kafka/__consumer_offsets-10/00000000000000018880.log
-rw-r--r--. 1 root root    74650 14 giu 17.55 data/kafka/__consumer_offsets-10/00000000000000021214.log
-rw-r--r--. 1 root root    35452 14 giu 17.44 data/kafka/__consumer_offsets-1/00000000000000018880.log
-rw-r--r--. 1 root root    47674 14 giu 17.54 data/kafka/__consumer_offsets-1/00000000000000021214.log
-rw-r--r--. 1 root root        0 14 giu 17.14 data/kafka/__consumer_offsets-11/00000000000000000000.log
-rw-r--r--. 1 root root    33325 14 giu 17.40 data/kafka/__consumer_offsets-11/00000000000000018781.log
-rw-r--r--. 1 root root    62264 14 giu 17.55 data/kafka/__consumer_offsets-11/00000000000000021119.log
-rw-r--r--. 1 root root        0 14 giu 17.12 data/kafka/__consumer_offsets-12/00000000000000000000.log
-rw-r--r--. 1 root root    35452 14 giu 17.37 data/kafka/__consumer_offsets-12/00000000000000018727.log
-rw-r--r--. 1 root root    74650 14 giu 17.55 data/kafka/__consumer_offsets-12/00000000000000021061.log
-rw-r--r--. 1 root root        0 14 giu 17.18 data/kafka/__consumer_offsets-13/00000000000000000000.log
-rw-r--r--. 1 root root    35452 14 giu 17.41 data/kafka/__consumer_offsets-13/00000000000000018880.log
-rw-r--r--. 1 root root    65658 15 giu 09.52 data/kafka/__consumer_offsets-13/00000000000000021214.log
...

After removing cleaner-offset-checkpoint (as you can see, there are fewer segments per partition):

# ls -la data/kafka/__consumer_offsets-*/*
-rw-r--r--. 1 root root    35452 14 giu 17.45 data/kafka/__consumer_offsets-0/00000000000000000000.log
-rw-r--r--. 1 root root    38682 14 giu 17.54 data/kafka/__consumer_offsets-0/00000000000000021214.log
-rw-r--r--. 1 root root    35452 14 giu 17.37 data/kafka/__consumer_offsets-10/00000000000000000000.log
-rw-r--r--. 1 root root    35452 14 giu 17.44 data/kafka/__consumer_offsets-1/00000000000000000000.log
-rw-r--r--. 1 root root    74650 14 giu 17.55 data/kafka/__consumer_offsets-10/00000000000000021214.log
-rw-r--r--. 1 root root    47674 14 giu 17.54 data/kafka/__consumer_offsets-1/00000000000000021214.log
-rw-r--r--. 1 root root    33325 14 giu 17.40 data/kafka/__consumer_offsets-11/00000000000000000000.log
-rw-r--r--. 1 root root    62264 14 giu 17.55 data/kafka/__consumer_offsets-11/00000000000000021119.log
-rw-r--r--. 1 root root    35452 14 giu 17.37 data/kafka/__consumer_offsets-12/00000000000000000000.log
-rw-r--r--. 1 root root    74650 14 giu 17.55 data/kafka/__consumer_offsets-12/00000000000000021061.log
-rw-r--r--. 1 root root    35452 14 giu 17.41 data/kafka/__consumer_offsets-13/00000000000000000000.log
-rw-r--r--. 1 root root    65658 15 giu 09.52 data/kafka/__consumer_offsets-13/00000000000000021214.log
...

Upvotes: 0

Views: 1796

Answers (1)

nipuna

Reputation: 4105

The cleaner-offset-checkpoint file lives in the Kafka log directory. It records the last cleaned offset for each topic partition on the broker, one entry per line:

<topic name> <partition number> <last cleaned offset>
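For illustration, a checkpoint file might look like this (the topic names and offsets here are made up; the first line is a format version and the second the number of entries, which is how Kafka's offset checkpoint files are typically laid out):

```
0
2
__consumer_offsets 0 21214
__consumer_offsets 1 21214
```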

So when Kafka starts compaction, it reads this offset to calculate the dirty ratio, i.e. to decide how many messages are already clean and how many are dirty and need to be compacted.
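As a rough sketch (not Kafka's actual implementation), the dirty ratio can be thought of as the fraction of log bytes written after the last cleaned offset; the segment sizes below are illustrative:

```python
def dirty_ratio(segments, last_cleaned_offset):
    """Sketch of the dirty-ratio idea. `segments` is a list of
    (base_offset, size_in_bytes) tuples; bytes in segments whose base
    offset is at or beyond the last cleaned offset count as dirty."""
    clean = sum(size for base, size in segments if base < last_cleaned_offset)
    dirty = sum(size for base, size in segments if base >= last_cleaned_offset)
    total = clean + dirty
    return dirty / total if total else 0.0

# With no checkpoint entry the last cleaned offset is effectively 0,
# so every byte in the log counts as dirty and the ratio is 1.0.
segments = [(0, 35452), (18880, 35452), (21214, 38682)]
print(dirty_ratio(segments, 0))      # no checkpoint: everything dirty
print(dirty_ratio(segments, 18880))  # only bytes after offset 18880 dirty
```

This is why deleting the checkpoint file makes the cleaner treat the whole log as dirty and recompact it from the start.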

By deleting this file, all logs are treated as dirty, because there is no last cleaned offset recorded for them. Compaction will then run for every topic partition for which this broker is the leader. If you only need __consumer_offsets to be compacted, remove only the records belonging to the __consumer_offsets partitions instead of deleting the whole file.
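A minimal sketch of that selective cleanup, assuming the checkpoint layout described above (a version line, an entry count, then one `<topic> <partition> <offset>` entry per line); stop the broker and back the file up before editing it:

```python
def drop_topic_entries(checkpoint_text, topic):
    """Remove the checkpoint entries for one topic and fix up the
    entry count on the second line. Assumes the format: version line,
    entry count, then '<topic> <partition> <offset>' per line."""
    lines = checkpoint_text.strip().splitlines()
    version, entries = lines[0], lines[2:]
    kept = [e for e in entries if e.split()[0] != topic]
    return "\n".join([version, str(len(kept))] + kept) + "\n"

# Illustrative file contents: dropping the __consumer_offsets entries
# makes the cleaner treat those partitions as fully dirty, while
# other topics keep their last cleaned offsets.
sample = ("0\n3\n"
          "__consumer_offsets 0 21214\n"
          "my-topic 0 500\n"
          "__consumer_offsets 1 21214\n")
print(drop_topic_entries(sample, "__consumer_offsets"))
```

Note the entry count on the second line must match the remaining entries, which is why the sketch rewrites it rather than just deleting lines.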

But on broker startup all brokers will be busy, because log compaction runs from the beginning of the logs. Other than that, there shouldn't be any issue.


Upvotes: 1
