Reputation: 3365
Under /tmp/streams-my-application-id I found the files that RocksDB uses. My intention was to check the file sizes by running du -h.
Looking at the directory names, I'm curious what they mean. I suppose they are related to the Kafka Streams tasks and the partitions.
Do the prefixes 0 and 1 indicate the number of topics used, and is the second number the partition?
This Kafka Streams app simply joins two topics using a KStream-KTable join; one of the topics is repartitioned and reduced into a KTable.
8,0K ./0_2
8,0K ./0_1
3,1M ./1_2/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
3,1M ./1_2/rocksdb
3,1M ./1_2
3,0M ./1_0/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
3,0M ./1_0/rocksdb
3,1M ./1_0
3,0M ./1_1/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
3,0M ./1_1/rocksdb
3,0M ./1_1
8,0K ./0_0
Upvotes: 1
Views: 177
Reputation: 20840
The directory names are derived from the sub-topology number and the partition number.
A Kafka Streams application is usually split into a number of sub-topologies (sub-topology 0, 1, 2, and so on). For stateful transformations, the state store directories use that numbering, so each directory corresponds to one stream task and is named like this:
<sub-topology-number>_<partition_number>
So the first number is the sub-topology and the second is the partition number.
8,0K ./0_2 // directory
8,0K ./0_1 // directory
3,1M ./1_2/rocksdb/KSTREAM-REDUCE-STATE-STORE-0000000002
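To make the naming scheme concrete, here is a minimal sketch that splits a directory name such as 1_2 into its two parts. TaskDirName is a hypothetical helper written for this illustration, not part of the Kafka Streams API, and it assumes the plain <sub-topology-number>_<partition_number> layout shown in the listing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Kafka Streams API.
final class TaskDirName {
    // Assumed layout: <sub-topology-number>_<partition_number>, e.g. "1_2".
    private static final Pattern FORMAT = Pattern.compile("(\\d+)_(\\d+)");

    final int subTopology;
    final int partition;

    private TaskDirName(int subTopology, int partition) {
        this.subTopology = subTopology;
        this.partition = partition;
    }

    // Parses a directory name; rejects anything that is not two numbers joined by '_'.
    static TaskDirName parse(String dirName) {
        Matcher m = FORMAT.matcher(dirName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a task directory name: " + dirName);
        }
        return new TaskDirName(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
    }

    public static void main(String[] args) {
        // Directory names taken from the du output in the question.
        for (String name : new String[] {"0_2", "1_0", "1_2"}) {
            TaskDirName t = parse(name);
            System.out.println(name + " -> sub-topology " + t.subTopology
                    + ", partition " + t.partition);
        }
    }
}
```

Applied to the listing in the question, 0_0, 0_1 and 0_2 belong to sub-topology 0, while 1_0, 1_1 and 1_2 belong to sub-topology 1, each for partitions 0 to 2.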
The store name KSTREAM-REDUCE-STATE-STORE-0000000002 has the format
<Processor Node Type>-<Processor Node number>
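The same idea applies to the store name. The sketch below splits a generated name into its type prefix and its number, assuming the name always ends with a dash followed by a zero-padded number as in the example above. StoreName is a hypothetical helper, not a Kafka Streams class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Kafka Streams API.
final class StoreName {
    // Assumed layout: <type prefix>-<zero-padded number>.
    private static final Pattern FORMAT = Pattern.compile("(.+)-(\\d+)");

    final String type;
    final int number;

    private StoreName(String type, int number) {
        this.type = type;
        this.number = number;
    }

    // Splits at the last dash that is followed only by digits.
    static StoreName parse(String name) {
        Matcher m = FORMAT.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected store name: " + name);
        }
        return new StoreName(m.group(1), Integer.parseInt(m.group(2)));
    }

    public static void main(String[] args) {
        StoreName s = parse("KSTREAM-REDUCE-STATE-STORE-0000000002");
        System.out.println("type = " + s.type + ", number = " + s.number);
    }
}
```

This only covers auto-generated names; a store that was given an explicit name in the application code will not necessarily follow this pattern.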
Upvotes: 3