CuriousMind

Reputation: 8903

What exactly is StreamTask in StreamThread in kafka streams?

I am trying to understand how Kafka Streams works under the hood (to get to know it a little better), and I came across a Confluent link that is really wonderful.

It mentions two terms: StreamThreads and StreamTasks.

I am not able to understand what exactly StreamTasks are.

Any explanation in simple words would be of great help.

Upvotes: 1

Views: 325

Answers (1)

Matthias J. Sax

Reputation: 62285

"Tasks" are a logical abstraction of work that can be done in parallel (i.e., units of work that can be processed independently of each other). Kafka Streams basically creates one task for each input topic partition, because data in different partitions can be processed independently (this is a simplification, but it holds if you have a single input topic; for joins it's a little bit different).

A StreamThread is basically a JVM thread. Tasks are assigned to StreamThreads for execution. In the current implementation, a StreamThread basically loops over all its tasks and processes some amount of input data for each task. In between, the StreamThread (which uses a KafkaConsumer) polls the broker for new data for all its assigned tasks.
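To make the loop concrete, here is a minimal sketch in plain Java. It does not use the Kafka Streams API: `Task`, `FakeBroker`, and `runLoop` are hypothetical stand-ins invented for illustration, and the per-turn record limit is an assumption. Unlike a real StreamThread, which keeps running, this loop stops once no task makes progress.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StreamThreadSketch {

    /** Hypothetical stand-in for a task: buffers and processes records of one partition. */
    static final class Task {
        final int partition;
        final Deque<String> buffered = new ArrayDeque<>();
        final List<String> processed = new ArrayList<>();

        Task(int partition) {
            this.partition = partition;
        }

        /** Processes at most maxRecords buffered records; returns how many were processed. */
        int process(int maxRecords) {
            int n = 0;
            while (n < maxRecords && !buffered.isEmpty()) {
                // The "processing" here is just upper-casing the value.
                processed.add(buffered.poll().toUpperCase());
                n++;
            }
            return n;
        }
    }

    /** Hypothetical stand-in for the broker: pending records per partition. */
    static final class FakeBroker {
        final Map<Integer, Deque<String>> pending = new LinkedHashMap<>();

        void send(int partition, String value) {
            pending.computeIfAbsent(partition, p -> new ArrayDeque<>()).add(value);
        }

        /** Returns and clears everything pending for the given partition. */
        List<String> poll(int partition) {
            List<String> out = new ArrayList<>();
            Deque<String> q = pending.get(partition);
            if (q != null) {
                out.addAll(q);
                q.clear();
            }
            return out;
        }
    }

    /** One thread's loop: poll for all assigned tasks, then give each task a turn. */
    static int runLoop(List<Task> tasks, FakeBroker broker, int maxRecordsPerTurn) {
        int total = 0;
        boolean progress = true;
        while (progress) { // a real StreamThread would loop until shut down
            progress = false;
            for (Task t : tasks) {
                t.buffered.addAll(broker.poll(t.partition));
            }
            for (Task t : tasks) {
                int n = t.process(maxRecordsPerTurn);
                total += n;
                if (n > 0) {
                    progress = true;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker();
        broker.send(0, "a");
        broker.send(0, "b");
        broker.send(0, "c");
        broker.send(1, "x");

        List<Task> tasks = List.of(new Task(0), new Task(1));
        int total = runLoop(tasks, broker, 2);

        System.out.println("processed " + total + " records");
        System.out.println("task 0: " + tasks.get(0).processed);
        System.out.println("task 1: " + tasks.get(1).processed);
    }
}
```

Note that each task only ever touches records from its own partition, which is what lets a single thread interleave several tasks without them interfering with each other.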

Because tasks are independent of each other, you can run as many threads as there are tasks. In that case, each thread would execute only a single task.
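The relation between task count and thread count can be sketched as follows. This is plain Java rather than the Kafka Streams API: `createTasks` and `assign` are hypothetical helpers, and the round-robin distribution is a simplifying assumption, not the actual assignment strategy. The task names follow the `<subtopology>_<partition>` pattern that Kafka Streams uses for task IDs. In a real application the thread count per instance is set with the `num.stream.threads` config (default 1).

```java
import java.util.ArrayList;
import java.util.List;

public class TaskAssignmentSketch {

    /** Creates one task ID per input partition, e.g. "0_0", "0_1", ... */
    static List<String> createTasks(int numPartitions) {
        List<String> tasks = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            tasks.add("0_" + p);
        }
        return tasks;
    }

    /** Distributes tasks over threads round-robin (a simplifying assumption). */
    static List<List<String>> assign(List<String> tasks, int numThreads) {
        List<List<String>> perThread = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            perThread.add(new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            perThread.get(i % numThreads).add(tasks.get(i));
        }
        return perThread;
    }

    public static void main(String[] args) {
        List<String> tasks = createTasks(4); // topic with 4 partitions

        // Fewer threads than tasks: each thread runs several tasks.
        System.out.println("2 threads: " + assign(tasks, 2));

        // As many threads as tasks: exactly one task per thread.
        System.out.println("4 threads: " + assign(tasks, 4));

        // More threads than tasks: the extra threads get nothing to do.
        System.out.println("6 threads: " + assign(tasks, 6));
    }
}
```

The last case shows why the number of tasks is the upper bound on useful parallelism: threads beyond that number have no task to run.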

Upvotes: 5
