Reputation: 20390
How do I get the current offset, or offset by partition, or record count for a given topic? It doesn't need to be perfect, but I want a ballpark idea of how much data is in a Kafka topic.
Upvotes: 1
Views: 2369
Reputation: 39810
To get the offsets for the partitions of a topic, you can use kafka.tools.GetOffsetShell:
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -1
If you want to get the latest offset for a particular group, you can also use:
./bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic your_topic_name --zookeeper localhost:2181 --group your_group_id
To count the entries in a topic, you can consume the whole topic; when you stop the consumer, the total number of consumed messages is reported. Alternatively, you can sum the latest offset of every partition:
./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <broker>:<port> --topic <topic-name> --time -1 --offsets 1 | awk -F ":" '{sum += $3} END {print sum}'
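Note that summing the latest offsets counts every record ever written, so it can overshoot once retention has deleted old segments. A rough sketch of a tighter estimate is to subtract the earliest offsets (GetOffsetShell with --time -2) from the latest ones (--time -1). The helper names sum_offsets and estimate_records below are hypothetical, and the sample data is made up purely to show the topic:partition:offset output format:

```shell
# Sum the offset column (third ':'-separated field) of GetOffsetShell output on stdin.
sum_offsets() {
  awk -F ':' '{ sum += $3 } END { printf "%d\n", sum }'
}

# Hypothetical helper: estimate retained records as
# (sum of latest offsets) - (sum of earliest offsets).
# $1 = GetOffsetShell output for --time -1, $2 = output for --time -2.
estimate_records() {
  echo $(( $(printf '%s\n' "$1" | sum_offsets) - $(printf '%s\n' "$2" | sum_offsets) ))
}

# Made-up sample output, for illustration only (not from a real cluster).
latest_sample='your_topic_name:0:1500
your_topic_name:1:1200'
earliest_sample='your_topic_name:0:500
your_topic_name:1:200'

estimate_records "$latest_sample" "$earliest_sample"   # prints 2000
```

Against a real cluster you would pass the two command outputs instead of the samples, e.g. estimate_records "$(./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -1)" "$(./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic your_topic_name --time -2)". On compacted topics the result is still only a ballpark figure, since offsets are not reused after compaction.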
Upvotes: 2