Reputation: 784
For metrics, we need to see the total size of a Kafka topic in bytes across all partitions and brokers.
I have been searching for quite a while and haven't worked out whether this is possible or how to do it.
We are on version 0.8.2 of Kafka.
Upvotes: 36
Views: 81533
Reputation: 1
bin/kafka-log-dirs.sh \
--bootstrap-server localhost:9092 \
--topic-list event \
--describe \
| grep -oP '(?<=size":)\d+' \
| awk '{ sum += $1 } END { printf "%.2f\n", sum / (1024^3) }'
This solution by MSE worked; I just tweaked it a little to show the size in GB (it divides by 1024^3, so strictly GiB).
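To make the unit conversion concrete, here is a minimal sketch of the same division on its own. The function name `bytes_to_gib` and the input value are hypothetical and chosen only for illustration; the arithmetic is the `sum / (1024^3)` step from the command above.

```shell
# Hypothetical helper: convert a byte count to GiB with two decimals.
# Uses the same awk division as the pipeline above.
bytes_to_gib() {
    awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024^3) }'
}

# Illustrative input: exactly 3 * 1024^3 bytes.
bytes_to_gib 3221225472
```

Dividing by 1000^3 instead would give decimal gigabytes, if that is what your metrics system expects.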
Upvotes: 0
Reputation: 161864
With this command, you get a list of topic details (topic name, partition count, and size in bytes):
kafka-log-dirs.sh --bootstrap-server 127.0.0.1:9092 --describe |
grep '^{' |
jq -c '.brokers[].logDirs[].partitions | map(.topic=(.partition|sub("-\\d+$";""))) | group_by(.topic)[] | {topic:.[0].topic, partitions:length, size:map(.size)|add}'
{"topic":"topic1","partitions":1,"size":1234}
{"topic":"topic2","partitions":2,"size":5678}
{"topic":"topic3","partitions":3,"size":0}
Upvotes: 6
Reputation: 159
If you want the size per broker instead of the total topic size, I created this query to help with that:
kafka-log-dirs.sh --describe --bootstrap-server localhost:9092 --topic-list ${topic_name} | grep '^{' | jq '[.brokers[] | {broker:.broker, size:[.logDirs[].partitions[].size] | add}]' | less
Returns topic size per broker by summing up individual partition sizes. Useful when debugging issues with uneven partition distribution/hot partitions.
Sample output:
{
"broker": 7031,
"size": 182197855891
},
{
"broker": 6066,
"size": 182357034551
},
{
"broker": 6052,
"size": 184447693788
},
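When checking for uneven distribution, one simple number to look at is the gap between the largest and smallest broker. The following is a rough sketch only: `broker_size_spread` is a hypothetical helper, it assumes one integer size per line on stdin (for example after extracting the `size` values from the output above), and it assumes a shell with 64-bit integer arithmetic.

```shell
# Hypothetical helper: read one byte count per line and print max - min.
broker_size_spread() {
    min=
    max=
    while read -r size; do
        # Skip blank lines.
        [ -z "$size" ] && continue
        if [ -z "$min" ] || [ "$size" -lt "$min" ]; then min=$size; fi
        if [ -z "$max" ] || [ "$size" -gt "$max" ]; then max=$size; fi
    done
    # Prints 0 when no sizes were read.
    echo $(( ${max:-0} - ${min:-0} ))
}

# The three broker sizes from the sample output above.
printf '%s\n' 182197855891 182357034551 184447693788 | broker_size_spread
```

A spread that is small relative to the per-broker size, as in this sample, suggests the topic is spread fairly evenly.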
Upvotes: 2
Reputation: 2171
For people who want the output in a human-readable format, as a list covering all topics, here it is:
bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --list \
| xargs -I{} sh -c \
"echo -n '{} -> ' && bin/kafka-log-dirs.sh --bootstrap-server 127.0.0.1:9092 --topic-list {} --describe | grep '^{' | jq '[ ..|.size? | numbers ] | add' | numfmt --to iec --format '%8.4f'" \
| tee /tmp/topics-by-size.list
This will:
1. List all topics in Kafka
2. Pass them through `xargs`, which executes a command per topic
3. Get all log sizes per topic
4. Sum the log sizes
5. Pass the sum through `numfmt` to make it human readable
6. Save to a file while printing to stdout
I hope this helps people who wanted a copy and paste command.
Upvotes: 5
Reputation: 1688
If you are running Kafka in a Docker container (wurstmeister/kafka) and you are getting
Error: JMX connector server communication error: service:jmx:rmi ...
sun.management.AgentConfigurationError: java.rmi.server.ExportException: Port already in use: 6099; nested exception is:
java.net.BindException: Address in use (Bind failed)
You need to unset JMX_PORT before you run the shell script.
(unset JMX_PORT; ./kafka-log-dirs.sh \
--bootstrap-server 127.0.0.1:9092 --topic-list test --describe)
Upvotes: 1
Reputation: 541
You can see the partition size using the script /bin/kafka-log-dirs.sh
/bin/kafka-log-dirs.sh --describe --bootstrap-server <KafkaBrokerHost>:<KafkaBrokerPort> --topic-list <YourTopic>
Upvotes: 52
Reputation: 695
Another way of doing the same with a regular expression and awk (in case you don't have jq installed) is:
$ bin/kafka-log-dirs.sh \
--bootstrap-server 127.0.0.1:9092 \
--topic-list test \
--describe \
| grep -oP '(?<=size":)\d+' \
| awk '{ sum += $1 } END { print sum }'
This returns the size (in bytes) of the topic test, including its replicas. If you have a replication factor greater than 1 and you want the size of the unique topic messages, divide the value you get by the replication factor.
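As a small worked example of that division, here is a sketch with made-up numbers. `unique_topic_bytes` is a hypothetical helper, and both the total and the replication factor are illustrative values, not output from a real cluster.

```shell
# Hypothetical helper: estimate the unreplicated size of a topic.
# $1 = summed size in bytes across all replicas, $2 = replication factor.
unique_topic_bytes() {
    total_bytes=$1
    replication_factor=$2
    # Integer division; any remainder is discarded.
    echo $(( total_bytes / replication_factor ))
}

# Illustrative: 67704 bytes on disk with a replication factor of 3.
unique_topic_bytes 67704 3
```

Treat the result as an estimate: replicas are not guaranteed to be byte-for-byte the same size at any given moment.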
Upvotes: 11
Reputation: 1875
As Martbob very helpfully mentioned, you can do this using kafka-log-dirs. This produces JSON output (on one of the lines), so I can use the ever-so-useful jq tool to pull out the 'size' fields (some are null), select only the ones that are numbers, group them into an array, and then add them together.
kafka-log-dirs \
--bootstrap-server 127.0.0.1:9092 \
--topic-list 'topic_of_interest' \
--describe \
| grep '^{' \
| jq '[ ..|.size? | numbers ] | add'
Example output: 67704
I haven't verified if the output makes sense, so you should check that yourself.
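One way to check the summing logic is to run it against input where you already know the answer. The sketch below does that with a hand-written sample that imitates the shape of the JSON line; it is not real cluster output. To avoid depending on jq it swaps in a grep and awk sum rather than the jq filter above, and `sum_sizes` is a hypothetical name.

```shell
# Hypothetical helper: sum every "size":<number> field found on stdin.
sum_sizes() {
    grep -o '"size":[0-9]*' | awk -F: '{ sum += $2 } END { print sum + 0 }'
}

# Hand-written sample with two partitions of 100 and 250 bytes.
sample='{"version":1,"brokers":[{"broker":0,"logDirs":[{"logDir":"/tmp/kafka-logs","error":null,"partitions":[{"partition":"test-0","size":100,"offsetLag":0,"isFuture":false},{"partition":"test-1","size":250,"offsetLag":0,"isFuture":false}]}]}]}'

printf '%s\n' "$sample" | sum_sizes
```

If the sample sums to 350 as expected, comparing the same pipeline's result on real output with the on-disk size of the topic's log directories is a reasonable next check.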
Upvotes: 27