Zazaeil

Reputation: 4119

Kafka Streams: does NUM_STREAM_THREADS_CONFIG > 1 break partition's total ordering?

Here we go: I have a fairly complicated topology of various joins, aggregations, filters, maps, etc. By default the NUM_STREAM_THREADS_CONFIG parameter equals 1, which is completely deterministic by definition; thus each partition's total ordering (which is guaranteed by Kafka itself) is preserved.

Will total ordering be preserved once I set NUM_STREAM_THREADS_CONFIG to 2 or more? Does it depend on the specific topology? I've checked the docs and went through the threading model section, yet did not find an answer.

Upvotes: 1

Views: 285

Answers (1)

Matthias J. Sax

Reputation: 62330

Data is always processed in per-partition offset order, even if you set num.stream.threads to a larger value.
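
For reference, the constant from the question, StreamsConfig.NUM_STREAM_THREADS_CONFIG, resolves to the property name "num.stream.threads". A minimal sketch of setting it, using only java.util.Properties so that it runs without Kafka on the classpath; the class name, application id and broker address are placeholders assumed for illustration:

```java
import java.util.Properties;

// Sketch only: builds the configuration that a real application would pass
// to new KafkaStreams(topology, props). No Kafka dependency is needed here.
class StreamThreadsConfigSketch {

    static Properties streamsProps(int numThreads) {
        Properties props = new Properties();
        props.put("application.id", "ordering-demo");      // placeholder value
        props.put("bootstrap.servers", "localhost:9092");  // placeholder value
        // Same key as StreamsConfig.NUM_STREAM_THREADS_CONFIG.
        props.put("num.stream.threads", Integer.toString(numThreads));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps(4).getProperty("num.stream.threads"));
    }
}
```

Changing this value changes how many threads share the tasks, not the order in which a task reads its partitions.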

In Kafka Streams, sub-topologies are translated into tasks (based on input topic partitions) and tasks process records of their partitions in offset order. The number of tasks limits the number of threads you can keep busy (similar to the maximum number of consumers in a consumer group). If you configure more threads than available tasks, some threads just stay idle.
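
To make the task/thread relationship concrete, here is a simplified model in plain Java. It is not the actual Kafka Streams assignor: the class TaskAssignmentSketch and the round-robin strategy are assumptions made for illustration. It only shows that, for a single sub-topology, one task is created per input partition, each task is owned by exactly one thread, and surplus threads receive nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model, NOT the real Kafka Streams assignor.
class TaskAssignmentSketch {

    // Returns, for each thread, the task ids (= partition numbers) it owns.
    static List<List<Integer>> assign(int numPartitions, int numThreads) {
        List<List<Integer>> threads = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            threads.add(new ArrayList<>());
        }
        // Round-robin distribution is an assumption for illustration only.
        for (int task = 0; task < numPartitions; task++) {
            threads.get(task % numThreads).add(task);
        }
        return threads;
    }

    // Counts threads that received no task and therefore stay idle.
    static long idleThreads(List<List<Integer>> assignment) {
        return assignment.stream().filter(List::isEmpty).count();
    }

    public static void main(String[] args) {
        // 3 partitions but 5 threads: two threads have nothing to do.
        List<List<Integer>> assignment = assign(3, 5);
        for (int t = 0; t < assignment.size(); t++) {
            System.out.println("thread-" + t + " -> tasks " + assignment.get(t));
        }
        System.out.println("idle threads: " + idleThreads(assignment));
    }
}
```

Because a partition belongs to exactly one task and a task to exactly one thread, two threads never read the same partition concurrently, which is why per-partition order is unaffected by the thread count.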

If a task processes data from multiple topics/partitions, there is no strict ordering guarantee for data of different partitions. Kafka Streams will take the record timestamps into account though, and process records with smaller timestamps first.
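
A rough model of that behavior in plain Java: the task compares only the head record of each partition buffer and picks the one with the smallest timestamp. The names TimestampOrderSketch, Rec and processOrder are hypothetical, and the sketch assumes all records are already buffered; the real implementation also depends on what has been fetched so far and on settings such as max.task.idle.ms, which are ignored here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified model of how one task interleaves several input partitions.
class TimestampOrderSketch {

    // Hypothetical record type: only the fields needed for ordering.
    static final class Rec {
        final int partition;
        final long offset;
        final long timestamp;

        Rec(int partition, long offset, long timestamp) {
            this.partition = partition;
            this.offset = offset;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return "p" + partition + "@" + offset;
        }
    }

    // Each inner list is one partition's buffer, already in offset order.
    static List<Rec> processOrder(List<List<Rec>> partitions) {
        List<Deque<Rec>> buffers = new ArrayList<>();
        for (List<Rec> p : partitions) {
            buffers.add(new ArrayDeque<>(p));
        }
        List<Rec> out = new ArrayList<>();
        while (true) {
            Deque<Rec> next = null;
            // Only head records are compared, so offset order within a
            // partition can never be violated. Ties go to the earlier buffer.
            for (Deque<Rec> b : buffers) {
                if (b.isEmpty()) continue;
                if (next == null || b.peekFirst().timestamp < next.peekFirst().timestamp) {
                    next = b;
                }
            }
            if (next == null) return out; // all buffers drained
            out.add(next.pollFirst());
        }
    }

    public static void main(String[] args) {
        List<Rec> p0 = List.of(new Rec(0, 0, 10), new Rec(0, 1, 30));
        List<Rec> p1 = List.of(new Rec(1, 0, 20), new Rec(1, 1, 25));
        System.out.println(processOrder(List.of(p0, p1)));
    }
}
```

Note that a record with an out-of-order timestamp inside a partition is still processed after the records before it in that partition: timestamps only decide which partition to read from next.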

Upvotes: 3
