Reputation: 5178
I pushed a message that was too big into a Kafka topic on my local machine, and now I'm getting an error:
kafka.common.InvalidMessageSizeException: invalid message size
Increasing the fetch.size is not ideal here, because I don't actually want to accept messages that big.
Upvotes: 268
Views: 366297
Reputation: 53
I have read almost all of the answers. We are using Kafka 3.4.0 in KRaft mode, so I can add an answer for KRaft. The procedure is not really different: on a machine that has the Kafka binaries and can reach Kafka's bootstrap servers, run:
bin/kafka-configs.sh --bootstrap-server :9092 --entity-type topics --entity-name your-topic --alter --add-config retention.ms=1000
The problem is that time-based retention is not the only thing Kafka looks at when deleting logs from the filesystem. You also need to consider segment.bytes. Kafka rolls a segment when its size on disk reaches segment.bytes for the partition in question. If a partition's open segment has not rolled yet, it is not going to be deleted, even if you set retention.ms to 1 millisecond.
If you are looking for a way to clear a topic whose messages are, let's say, 2000 bytes each:
set segment.bytes:
bin/kafka-configs.sh --bootstrap-server :9092 --entity-type topics --entity-name your-topic --alter --add-config segment.bytes=<smaller than 1 message's total bytes>
set retention.ms:
bin/kafka-configs.sh --bootstrap-server :9092 --entity-type topics --entity-name your-topic --alter --add-config retention.ms=1000
Keep in mind that the topic is NOT going to clear magically in 1 second. The retention check should trigger within a second, BUT rolling the open segments takes longer than that (close to 5 minutes). So keep an eye on the log sizes on the brokers and reset these configs once the log sizes are 0 for the topic:
bin/kafka-configs.sh --bootstrap-server :9092 --entity-type topics --entity-name your-topic --alter --delete-config segment.bytes
bin/kafka-configs.sh --bootstrap-server :9092 --entity-type topics --entity-name your-topic --alter --delete-config retention.ms
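The "keep an eye on the log sizes" step can be scripted. Below is a minimal Python sketch, assuming it runs on a broker host and that partition directories are named <topic>-<partition> under the broker's log directory. The function name topic_log_bytes is hypothetical, and the log directory path is something you have to supply for your own installation.

```python
import re
from pathlib import Path


def topic_log_bytes(log_dir: str, topic: str) -> int:
    """Sum the sizes of all *.log segment files for one topic on this broker."""
    # Match "<topic>-<partition number>" exactly, so that "your-topic" does not
    # also pick up directories of a different topic such as "your-topic-2".
    pattern = re.compile(rf"^{re.escape(topic)}-\d+$")
    total = 0
    for entry in Path(log_dir).iterdir():
        if entry.is_dir() and pattern.match(entry.name):
            # Only the .log files hold records; index files are ignored here.
            total += sum(f.stat().st_size for f in entry.glob("*.log"))
    return total
```

You could call this in a loop with a sleep in between and run the two --delete-config commands once it returns 0. It only sees the disks of the broker it runs on, so repeat it on every broker that hosts a replica.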
Upvotes: 1
Reputation: 1345
First enable topic deletion in the broker config and restart Kafka:
echo "delete.topic.enable=true" >> /opt/kafka/config/server.properties
sudo systemctl stop kafka
sudo systemctl start kafka
Delete the topic:
/opt/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic flows
Recreate the topic:
/opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic flows
Read the topic:
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic flows --from-beginning
Upvotes: 3
Reputation: 3127
Besides updating retention.ms and retention.bytes, I noticed that the topic cleanup policy should be "delete" (the default). If it is "compact", Kafka holds on to messages longer, so you also have to specify delete.retention.ms.
$ ./bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-name test-topic-3-100 --entity-type topics
Configs for topics:test-topic-3-100 are retention.ms=1000,delete.retention.ms=10000,cleanup.policy=delete,retention.bytes=1
I also had to check that the earliest and latest offsets are the same to confirm the purge actually happened. You can also check with du -h /tmp/kafka-logs/test-topic-3-100-*
$ ./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "BROKER:9095" --topic test-topic-3-100 --time -1 | awk -F ":" '{sum += $3} END {print sum}'
26599762
$ ./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "BROKER:9095" --topic test-topic-3-100 --time -2 | awk -F ":" '{sum += $3} END {print sum}'
26599762
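The same comparison can be done without awk. This is a small Python sketch, assuming the GetOffsetShell output has one topic:partition:offset line per partition, which is what the awk commands above rely on. The function names are hypothetical.

```python
def sum_offsets(get_offset_shell_output: str) -> int:
    """Add up the offset field of every 'topic:partition:offset' line."""
    total = 0
    for line in get_offset_shell_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so only the last two fields are separated.
        _topic, _partition, offset = line.rsplit(":", 2)
        total += int(offset)
    return total


def topic_is_empty(earliest_output: str, latest_output: str) -> bool:
    """True when the summed earliest offsets equal the summed latest offsets."""
    # The earliest offset of a partition never exceeds its latest offset,
    # so equal sums mean every partition has earliest == latest.
    return sum_offsets(earliest_output) == sum_offsets(latest_output)
```

Feed it the captured output of the --time -2 (earliest) and --time -1 (latest) runs. When both sums match, as with 26599762 above, no records remain.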
The other problem is that you have to get the current config first, so you remember what to revert to after the deletion succeeds:
./bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-name test-topic-3-100 --entity-type topics
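To avoid forgetting the original values, the describe output can be saved in a structured form before changing anything. Below is a minimal Python sketch, assuming the single-line "Configs for topics:<name> are key=value,..." format shown earlier in this answer. Other Kafka versions may print a different layout, in which case this parser would need adjusting. The function name is hypothetical.

```python
def parse_topic_configs(describe_line: str) -> dict:
    """Turn 'Configs for topics:<name> are k1=v1,k2=v2' into {'k1': 'v1', ...}."""
    # Everything after " are " is the comma-separated list of overrides.
    _, _, config_part = describe_line.partition(" are ")
    configs = {}
    for pair in config_part.split(","):
        key, sep, value = pair.strip().partition("=")
        if sep:  # skip empty fragments, e.g. when no overrides are set
            configs[key] = value
    return configs
```

Keys that are present in the saved dictionary can later be restored with --add-config, and keys you added only for the purge can be removed with --delete-config.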
Upvotes: 4
Reputation: 565
Sometimes, if you have a saturated cluster (too many partitions, encrypted topic data, SSL, a controller on a bad node, or a flaky connection), it will take a long time to purge the topic.
I follow these steps, particularly when using TLS.
1: Run with the Kafka tools:
kafka-configs.sh --alter --entity-type topics --zookeeper zookeeper01.kafka.com --add-config retention.ms=1 --entity-name <topic-name>
2: Run:
kafka-console-consumer --consumer-property security.protocol=SSL --consumer-property ssl.truststore.location=/etc/schema-registry/secrets/trust.jks --consumer-property ssl.truststore.password=password --consumer-property ssl.keystore.location=/etc/schema-registry/secrets/identity.jks --consumer-property ssl.keystore.password=password --consumer-property ssl.key.password=password --bootstrap-server broker01.kafka.com:9092 --topic <topic-name> --new-consumer --from-beginning
3: Set topic retention back to the original setting, once topic is empty.
kafka-configs.sh --alter --entity-type topics --zookeeper zookeeper01.kafka.com --add-config retention.ms=604800000 --entity-name <topic-name>
Hope this helps someone, as it isn't easily advertised.
Upvotes: 5
Reputation: 209
Just in case someone is looking for an updated answer (in 2022), I found the following will work for Kafka version 3.3.1. This will change the configuration for "your-topic" so that messages are retained for 1000ms. After messages are purged, then you can set back to a different value.
bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name your-topic --alter --add-config retention.ms=1000
Upvotes: -1
Reputation: 5907
I'm using Kafka 2.13 tools. Now --zookeeper is an unrecognized option for kafka-topics.sh. To delete a topic:
bin/kafka-topics.sh --bootstrap-server [kafka broker]:9092 --delete --topic [topic name]
Just take into account that to create the same topic again you may need to wait a while if you had a lot of data in the deleted topic. When you try to create the same topic, you may get the error:
ERROR org.apache.kafka.common.errors.TopicExistsException: Topic '[topic name]' is marked for deletion.
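One way to handle the waiting is to retry the create call until the deletion has finished. Here is a generic Python sketch of such a retry loop. It does not talk to Kafka itself: action stands for whatever call creates the topic in your setup, and is_retryable decides whether an exception means "still marked for deletion". Both are hypothetical placeholders, as is the function name.

```python
import time


def retry_until(action, is_retryable, timeout_s=60.0, interval_s=1.0,
                sleep=time.sleep, clock=time.monotonic):
    """Call action() until it succeeds, pausing between retryable failures."""
    deadline = clock() + timeout_s
    while True:
        try:
            return action()
        except Exception as exc:
            # Give up on unexpected errors, or once the time budget is spent.
            if not is_retryable(exc) or clock() >= deadline:
                raise
            sleep(interval_s)
```

For example, is_retryable could check whether the error message contains "marked for deletion", matching the error shown above.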
Upvotes: 0
Reputation: 55589
The workaround of temporarily reducing the retention time for a topic, suggested by user644265 in this answer, still works, but recent versions of kafka-configs will warn that the --zookeeper option has been deprecated:
Warning: --zookeeper is deprecated and will be removed in a future version of Kafka
Use --bootstrap-server instead; for example
kafka-configs --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name my_topic --add-config retention.ms=100
and
kafka-configs --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name my_topic --delete-config retention.ms
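If you run this pair of commands often, it may help to build them in one place so the override and the reset always target the same topic. Below is a minimal Python sketch that only assembles the argument lists, using the same flags as the two commands above. The function names and the default broker address are assumptions for illustration.

```python
def retention_override_cmd(topic: str, retention_ms: int,
                           bootstrap: str = "localhost:9092") -> list:
    """Arguments for temporarily lowering retention.ms on a topic."""
    return [
        "kafka-configs", "--bootstrap-server", bootstrap,
        "--alter", "--entity-type", "topics", "--entity-name", topic,
        "--add-config", f"retention.ms={retention_ms}",
    ]


def retention_reset_cmd(topic: str, bootstrap: str = "localhost:9092") -> list:
    """Arguments for removing the retention.ms override again."""
    return [
        "kafka-configs", "--bootstrap-server", bootstrap,
        "--alter", "--entity-type", "topics", "--entity-name", topic,
        "--delete-config", "retention.ms",
    ]
```

Each list can be passed to subprocess.run(cmd, check=True). Note that --delete-config removes the override entirely, so a topic that had its own retention.ms before the purge needs that value set again with --add-config.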
Upvotes: 3
Reputation: 51764
Here are the steps to follow to delete a topic named MyTopic:
rm -rf /tmp/kafka-logs/MyTopic-0. Repeat for other partitions, and all replicas.
zkCli.sh, then rmr /brokers/MyTopic
If you miss step 3, then Apache Kafka will continue to report the topic as present (for example, if you run kafka-list-topic.sh).
Tested with Apache Kafka 0.8.0.
Upvotes: 52
Reputation: 2015
If you are using confluentinc/cp-kafka containers, here is the command to delete the topic.
docker exec -it <kafka-container-id> kafka-topics --zookeeper zookeeper:2181 --delete --topic <topic-name>
Success response:
Topic <topic-name> is marked for deletion.
Note: This will have no impact if delete.topic.enable is not set to true.
Upvotes: 0
Reputation: 3021
Tested in Kafka 0.8.2, for the quick-start example: First, Add one line to server.properties file under config folder:
delete.topic.enable=true
then, you can run this command:
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic test
Then recreate it, for clients to continue operations against an empty topic
Upvotes: 48
Reputation: 301
From kafka 1.1
Purge a topic
bin/kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics --entity-name tp_binance_kline --add-config retention.ms=100
Wait at least 1 minute to be sure that Kafka has purged the topic, then remove the configuration so that it goes back to the default value:
bin/kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics --entity-name tp_binance_kline --delete-config retention.ms
Upvotes: 14
Reputation: 4814
Temporarily update the retention time on the topic to one second:
kafka-topics.sh \
--zookeeper <zkhost>:2181 \
--alter \
--topic <topic name> \
--config retention.ms=1000
And in newer Kafka releases, you can also do it with kafka-configs --entity-type topics
kafka-configs.sh \
--zookeeper <zkhost>:2181 \
--entity-type topics \
--alter \
--entity-name <topic name> \
--add-config retention.ms=1000
Then wait for the purge to take effect (the duration depends on the size of the topic). Once purged, restore the previous retention.ms value.
Upvotes: 459
Reputation: 18475
If you want to do this programmatically within a Java application, you can use the AdminClient's deleteRecords API. Using the AdminClient allows you to delete records on a partition and offset level.
According to the JavaDocs this operation is supported by brokers with version 0.11.0.0 or higher.
Here is a simple example:
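import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.RecordsToDelete;
import org.apache.kafka.common.TopicPartition;

public class DeleteRecordsExample {
    public static void main(String[] args) {
        String brokers = "localhost:9092";
        String topicName = "test";
        // Delete all records in partition 0 with an offset lower than 5
        TopicPartition topicPartition = new TopicPartition(topicName, 0);
        RecordsToDelete recordsToDelete = RecordsToDelete.beforeOffset(5L);
        Map<TopicPartition, RecordsToDelete> topicPartitionRecordToDelete = new HashMap<>();
        topicPartitionRecordToDelete.put(topicPartition, recordsToDelete);
        // Create the AdminClient; try-with-resources closes it when done
        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, brokers);
        try (AdminClient adminClient = AdminClient.create(properties)) {
            // Block until the brokers have processed the deletion
            adminClient.deleteRecords(topicPartitionRecordToDelete).all().get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }
}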
Upvotes: 4
Reputation: 911
Have you considered having your app simply use a new, renamed topic (i.e. a topic that is named like the original topic but with a "1" appended at the end)?
That would also give your app a fresh clean topic.
Upvotes: -2
Reputation: 780
The following command can be used to delete all the existing messages in a Kafka topic:
kafka-delete-records --bootstrap-server <kafka_server:port> --offset-json-file delete.json
The structure of the delete.json file should be as follows:
{ "partitions": [ { "topic": "foo", "partition": 1, "offset": -1 } ], "version": 1 }
where "offset": -1 deletes all the records in that partition. (This command has been tested with Kafka 2.0.1.)
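The JSON above covers a single partition, and each partition of the topic needs its own entry. Below is a minimal Python sketch that generates the file for every partition, assuming partitions are numbered from 0 and that you already know the partition count. The function name is hypothetical.

```python
import json


def build_delete_records_spec(topic: str, partition_count: int) -> dict:
    """Build the kafka-delete-records JSON body for all partitions of a topic."""
    return {
        "partitions": [
            # offset -1 asks for every record in the partition to be deleted
            {"topic": topic, "partition": p, "offset": -1}
            for p in range(partition_count)
        ],
        "version": 1,
    }


def write_delete_records_file(path: str, topic: str, partition_count: int) -> None:
    """Write the spec to a file usable with --offset-json-file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_delete_records_spec(topic, partition_count), fh, indent=2)
```

For example, write_delete_records_file("delete.json", "foo", 3) produces a file with entries for partitions 0, 1 and 2.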
Upvotes: 28
Reputation: 1280
A lot of great answers over here but among them, I didn't find one about docker. I spent some time to figure out that using the broker container is wrong for this case (obviously!!!)
## this is wrong!
docker exec broker1 kafka-topics --zookeeper localhost:2181 --alter --topic mytopic --config retention.ms=1000
Exception in thread "main" kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:258)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:254)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:112)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1826)
at kafka.admin.TopicCommand$ZookeeperTopicService$.apply(TopicCommand.scala:280)
at kafka.admin.TopicCommand$.main(TopicCommand.scala:53)
at kafka.admin.TopicCommand.main(TopicCommand.scala)
I should have used --zookeeper zookeeper:2181 instead of --zookeeper localhost:2181, as per my compose file.
## this might be an option, but as per comment below not all zookeeper images can have this script included
docker exec zookeper1 kafka-topics --zookeeper localhost:2181 --alter --topic mytopic --config retention.ms=1000
the correct command would be
docker exec broker1 kafka-configs --zookeeper zookeeper:2181 --alter --entity-type topics --entity-name dev_gdn_urls --add-config retention.ms=12800000
Hope it will save someone's time.
Also, be aware that the messages won't be deleted immediately; deletion happens once the log segment is closed.
Upvotes: 5
Reputation: 7071
Following @steven appleyard answer I executed the following commands on Kafka 2.2.0 and they worked for me.
bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name <topic-name> --describe
bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name <topic-name> --alter --add-config retention.ms=1000
bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name <topic-name> --alter --delete-config retention.ms
Upvotes: 7
Reputation: 7862
From Java, using the new AdminZkClient instead of the deprecated AdminUtils:
public void reset() {
try (KafkaZkClient zkClient = KafkaZkClient.apply("localhost:2181", false, 200_000,
5000, 10, Time.SYSTEM, "metricGroup", "metricType")) {
for (Map.Entry<String, List<PartitionInfo>> entry : listTopics().entrySet()) {
deleteTopic(entry.getKey(), zkClient);
}
}
}
private void deleteTopic(String topic, KafkaZkClient zkClient) {
// skip Kafka internal topic
if (topic.startsWith("__")) {
return;
}
System.out.println("Resetting Topic: " + topic);
AdminZkClient adminZkClient = new AdminZkClient(zkClient);
adminZkClient.deleteTopic(topic);
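// deletions are not instantaneous, so retry creation for up to ~5 seconds
boolean success = false;
int maxMs = 5_000;
while (maxMs > 0 && !success) {
try {
maxMs -= 100;
adminZkClient.createTopic(topic, 1, 1, new Properties(), null);
success = true;
} catch (TopicExistsException ignored) {
// topic is still marked for deletion; pause before the next attempt
try {
Thread.sleep(100);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
break;
}
}
}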
if (!success) {
Assert.fail("failed to create " + topic);
}
}
private Map<String, List<PartitionInfo>> listTopics() {
Properties props = new Properties();
props.put("bootstrap.servers", kafkaContainer.getBootstrapServers());
props.put("group.id", "test-container-consumer-group");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
Map<String, List<PartitionInfo>> topics = consumer.listTopics();
consumer.close();
return topics;
}
Upvotes: 2
Reputation: 51
./kafka-topics.sh --describe --zookeeper zkHost:2181 --topic myTopic
This should show the configured retention.ms. Then you can use the alter command above to change it to 1 second (and later revert to the default).
Topic:myTopic PartitionCount:6 ReplicationFactor:1 Configs:retention.ms=86400000
Upvotes: 2
Reputation: 1263
Another, rather manual, approach for purging a topic is:
in the brokers:
sudo service kafka stop
sudo rm -R /kafka-storage/kafka-logs/<some_topic_name>-*
in zookeeper:
sudo /usr/lib/zookeeper/bin/zkCli.sh
rmr /brokers/topics/<some_topic_name>
in the brokers again:
sudo service kafka start
Upvotes: 2
Reputation: 6418
UPDATE: This answer is relevant for Kafka 0.6. For Kafka 0.8 and later see answer by @Patrick.
Yes, stop kafka and manually delete all files from corresponding subdirectory (it's easy to find it in kafka data directory). After kafka restart the topic will be empty.
Upvotes: 5
Reputation: 37
To clean up all the messages from a particular topic, consume them using your application's consumer group (the group name should be the same as the application's Kafka group name):
./kafka-path/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topicName --from-beginning --group application-group
Upvotes: 2
Reputation: 470
Kafka doesn't have a direct method to purge or clean up a topic (queue), but you can do it by deleting the topic and recreating it.
First make sure the server.properties file contains delete.topic.enable=true, and add it if not.
Then delete the topic:
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic myTopic
Then create it again:
bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic myTopic --partitions 10 --replication-factor 2
Upvotes: 9
Reputation: 1497
To purge the queue you can delete the topic:
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic test
then re-create it:
bin/kafka-topics.sh --create --zookeeper localhost:2181 \
--replication-factor 1 --partitions 1 --topic test
Upvotes: 118
Reputation: 990
While the accepted answer is correct, that method has been deprecated. Topic configuration should now be done via kafka-configs.
kafka-configs --zookeeper localhost:2181 --entity-type topics --alter --add-config retention.ms=1000 --entity-name MyTopic
Configurations set via this method can be displayed with the command
kafka-configs --zookeeper localhost:2181 --entity-type topics --describe --entity-name MyTopic
Upvotes: 60
Reputation: 4391
Thomas' advice is great, but unfortunately zkCli in old versions of Zookeeper (for example 3.3.6) does not seem to support rmr. For example, compare the command line implementation in modern Zookeeper with version 3.3.
If you are faced with an old version of Zookeeper, one solution is to use a client library such as zc.zk for Python. For people not familiar with Python, you need to install it using pip or easy_install. Then start a Python shell (python) and you can do:
import zc.zk
zk = zc.zk.ZooKeeper('localhost:2181')
zk.delete_recursive('brokers/MyTopic')
or even
zk.delete_recursive('brokers')
if you want to remove all the topics from Kafka.
Upvotes: 4
Reputation: 49
The simplest approach is to set the date of the individual log files to be older than the retention period. Then the broker should clean them up and remove them for you within a few seconds. This offers several advantages:
In my experience with Kafka 0.7.x, removing the log files and restarting the broker could lead to invalid offset exceptions for certain consumers. This would happen because the broker restarts the offsets at zero (in the absence of any existing log files), and a consumer that was previously consuming from the topic would reconnect to request a specific [once valid] offset. If this offset happens to fall outside the bounds of the new topic logs, then no harm and the consumer resumes at either the beginning or the end. But, if the offset falls within the bounds of the new topic logs, the broker attempts to fetch the message set but fails because the offset doesn't align to an actual message.
This could be mitigated by also clearing the consumer offsets in zookeeper for that topic. But if you don't need a virgin topic and just want to remove the existing contents, then simply 'touch'-ing a few topic logs is far easier and more reliable, than stopping brokers, deleting topic logs, and clearing certain zookeeper nodes.
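The 'touch' step can be scripted so that the timestamp is guaranteed to fall outside the retention window. Below is a minimal Python sketch, assuming (as this answer does for Kafka 0.7.x) that the broker decides retention from the file modification time. That assumption may not hold for later Kafka versions. The function name and the safety margin are hypothetical.

```python
import os
import time


def backdate_files(paths, retention_ms: int, margin_s: float = 60.0, now=None) -> float:
    """Set access and modification times to just before the retention cutoff."""
    now = time.time() if now is None else now
    # Go back by the retention period plus a small safety margin.
    old = now - retention_ms / 1000.0 - margin_s
    for path in paths:
        os.utime(path, (old, old))
    return old
```

You would pass it the log files of the topic you want emptied, together with the retention period configured on the broker.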
Upvotes: 4