Reputation: 9761
Is it safe to say that, all in all, in Kafka Streams tasks represent subscriptions to partitions, while threads represent consumers?
That is, if there are 8 partitions there will always be 8 tasks. However, the number of consumers is determined by the number of threads available, and those threads are spread across application instances. So one application instance may represent 2 consumers, provided that it has 2 threads associated with it.
For full parallelism with a topic that has 8 partitions, we could have 2 application instances with 4 threads each, or one application instance with 8 threads, and so on.
Upvotes: 1
Views: 1043
Reputation: 20830
Yes. The number of tasks equals the maximum number of partitions among the input topics of the Kafka Streams app (strictly speaking, this is counted per sub-topology).
Suppose there are two topics, "A" and "B", each with 8 partitions. The number of tasks is then max(8, 8) = 8. Each stream thread runs its own consumer. If you set the number of threads to 2, the 2 threads split the tasks between them, so each thread gets 4 tasks to process.
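The arithmetic above can be sketched in plain Java. This is only an illustration of the counting, not Kafka Streams' actual assignment logic: the class and method names are hypothetical, and the round-robin split is an assumption chosen for simplicity (the real assignor also takes state and stickiness into account).

```java
import java.util.ArrayList;
import java.util.List;

public class TaskDistributionSketch {

    // Task count for one sub-topology: the largest partition count among its input topics.
    public static int taskCount(int... partitionsPerTopic) {
        int max = 0;
        for (int p : partitionsPerTopic) {
            max = Math.max(max, p);
        }
        return max;
    }

    // Spread task ids 0..tasks-1 over the threads round-robin (illustration only).
    public static List<List<Integer>> distribute(int tasks, int threads) {
        List<List<Integer>> perThread = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            perThread.add(new ArrayList<>());
        }
        for (int task = 0; task < tasks; task++) {
            perThread.get(task % threads).add(task);
        }
        return perThread;
    }

    public static void main(String[] args) {
        int tasks = taskCount(8, 8); // topics "A" and "B", 8 partitions each
        List<List<Integer>> plan = distribute(tasks, 2);
        for (int t = 0; t < plan.size(); t++) {
            System.out.println("thread-" + t + " -> tasks " + plan.get(t));
        }
    }
}
```

With 8 tasks and 2 threads, each thread ends up with 4 tasks, matching the example above.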
"For full parallelism with a topic that has 8 partitions, we could have 2 application instances with 4 threads each, or one application instance with 8 threads, and so on."
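That is right. To achieve full parallelism, the total number of threads should equal the number of tasks, i.e. the maximum number of partitions. Threads beyond that number would have no task to run. You can spread those threads over several application instances or keep them all in one.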
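As a rough sketch of how that translates into configuration: the thread count per instance is set with the `num.stream.threads` property. The example below uses plain `java.util.Properties` with string keys so that it compiles without the Kafka library. The class name, the application id, and the broker address are placeholders, not values from the question.

```java
import java.util.Properties;

public class StreamThreadConfigSketch {

    // Threads each instance needs so that instances * threads covers every task.
    public static int threadsPerInstance(int tasks, int instances) {
        return (tasks + instances - 1) / instances; // ceiling division
    }

    // Builds the settings for one instance; id and broker address are placeholders.
    public static Properties streamsProps(int tasks, int instances) {
        Properties props = new Properties();
        props.put("application.id", "my-streams-app");
        props.put("bootstrap.servers", "localhost:9092");
        props.put("num.stream.threads",
                String.valueOf(threadsPerInstance(tasks, instances)));
        return props;
    }

    public static void main(String[] args) {
        // 8 tasks over 2 instances, then 8 tasks on a single instance
        System.out.println(streamsProps(8, 2).getProperty("num.stream.threads"));
        System.out.println(streamsProps(8, 1).getProperty("num.stream.threads"));
    }
}
```

In a real application these properties would be passed to `new KafkaStreams(topology, props)`, and every instance sharing the same `application.id` joins the same consumer group.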
The Kafka Streams threading model is explained nicely here:
https://docs.confluent.io/current/streams/architecture.html#parallelism-model
Upvotes: 1