Holm

Reputation: 3365

Meaning of the RocksDB file names used by Kafka Streams

Under /tmp/streams-my-application-id I found the files that RocksDB uses. I wanted to check their sizes with du -h.

Looking at the directory names, I'm curious what they mean. I suppose they are related to the Kafka Streams tasks and the partitions.

Do the prefixes 0 and 1 refer to the number of topics used, and does the latter number refer to the partition?

This Kafka Streams app simply joins two topics using a KStream-KTable join; one of the topics is repartitioned and reduced into a KTable.

8,0K    ./0_2
8,0K    ./0_1
3,1M    ./1_2/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
3,1M    ./1_2/rocksdb
3,1M    ./1_2
3,0M    ./1_0/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
3,0M    ./1_0/rocksdb
3,1M    ./1_0
3,0M    ./1_1/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
3,0M    ./1_1/rocksdb
3,0M    ./1_1
8,0K    ./0_0

Upvotes: 1

Views: 177

Answers (1)

Nishu Tayal

Reputation: 20840

The directory names are derived from the sub-topology number and the partition number.

A Kafka Streams application is usually split into a number of sub-topologies (i.e. sub-topology 0, 1, 2, etc.). When you use stateful transformations, the state store directories use those numbers to build the directory names, in the format given below:

<sub-topology-number>_<partition_number>

So the first number is the sub-topology and the second one is the partition number:

8,0K    ./0_2    // directory
8,0K    ./0_1    // directory
3,1M    ./1_2/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
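If you want to check this against your own state directory, here is a minimal sketch that splits a directory name into its two numbers. It is not part of Kafka Streams; `TaskId` and `parse_task_dir` are hypothetical names used only for illustration, and it assumes the plain <sub-topology-number>_<partition_number> layout described above.

```python
import re
from typing import NamedTuple, Optional


class TaskId(NamedTuple):
    """Hypothetical container for the two numbers in a task directory name."""
    sub_topology: int
    partition: int


# Matches names made of two non-negative integers joined by "_", e.g. "1_2".
_TASK_DIR = re.compile(r"(\d+)_(\d+)")


def parse_task_dir(name: str) -> Optional[TaskId]:
    """Return the TaskId encoded in a directory name, or None if it does not match."""
    match = _TASK_DIR.fullmatch(name)
    if match is None:
        return None
    return TaskId(int(match.group(1)), int(match.group(2)))


# Directory names taken from the du output in the question, plus one non-task name.
for directory in ["0_2", "1_0", "rocksdb"]:
    print(directory, parse_task_dir(directory))
```

Names that do not follow the pattern, such as rocksdb, simply return None, so the helper can be run over every entry in the state directory.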

And the format of the name KSTREAM-REDUCE-STATE-STORE-0000000002 is:

<Processor Node Type>-<Processor Node number>
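As a rough illustration of that format, the sketch below splits such a name at its last hyphen into the type part and the number. `split_store_name` is a hypothetical helper, not a Kafka Streams API, and it assumes the name ends in a purely numeric suffix as in the example above.

```python
from typing import Tuple


def split_store_name(name: str) -> Tuple[str, int]:
    """Split 'PREFIX-0000000002' into ('PREFIX', 2) at the last hyphen."""
    prefix, separator, number = name.rpartition("-")
    # Reject names without a hyphen or without a decimal suffix.
    if not separator or not number.isdecimal():
        raise ValueError(f"unexpected store name: {name!r}")
    return prefix, int(number)


print(split_store_name("KSTREAM-REDUCE-STATE-STORE-0000000002"))
```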

Upvotes: 3
