Kafka Streams - Multiple topics as same source or one topic per source?

Question

When building a Kafka Streams topology, reads from multiple topics can be modeled in two different ways:

Option 1: Read all topics with the same source node.

topologyBuilder.addSource("sourceName", ..., "topic1", "topic2", "topic3");

Option 2: Read each topic using a separate source node.

topologyBuilder.addSource("sourceName1", ..., "topic1")
               .addSource("sourceName2", ..., "topic2")
               .addSource("sourceName3", ..., "topic3");

Is there any advantage of option 1 over option 2, or vice versa? All topics contain the same type of data and share the same processing logic.

user152468 · Accepted Answer

Given that, as you state, all input topics contain the same kind of data and the subsequent processing is identical, you should probably go with option 1, for two reasons:

1) It results in a smaller topology (one source node instead of one per topic).

2) You only need to connect one source node to your subsequent processing steps.

If processing later needs to differ between source topics, you can split the source node into several at that point.
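To make the two reasons concrete, here is a small sketch that counts nodes and parent links for both options. It is a toy model in plain Java, not the Kafka Streams API: the class `TopologySketch`, its methods, and the processor name "process" are hypothetical and only mirror the shape of the calls in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a processing graph; NOT the Kafka Streams API.
// All names here are hypothetical and chosen for illustration only.
final class TopologySketch {
    // Source node name -> topics it reads from.
    private final Map<String, List<String>> topicsBySource = new LinkedHashMap<>();
    // Processor node name -> names of the parent nodes feeding it.
    private final Map<String, List<String>> parentsByProcessor = new LinkedHashMap<>();

    TopologySketch addSource(String name, String... topics) {
        topicsBySource.put(name, Arrays.asList(topics));
        return this;
    }

    TopologySketch addProcessor(String name, String... parentNames) {
        parentsByProcessor.put(name, Arrays.asList(parentNames));
        return this;
    }

    // Total number of nodes (sources plus processors).
    int nodeCount() {
        return topicsBySource.size() + parentsByProcessor.size();
    }

    // Total number of parent -> child links that had to be wired up.
    int edgeCount() {
        return parentsByProcessor.values().stream().mapToInt(List::size).sum();
    }

    // Option 1: one source node reads all three topics.
    static TopologySketch option1() {
        return new TopologySketch()
                .addSource("sourceName", "topic1", "topic2", "topic3")
                .addProcessor("process", "sourceName");
    }

    // Option 2: one source node per topic, each connected to the processor.
    static TopologySketch option2() {
        return new TopologySketch()
                .addSource("sourceName1", "topic1")
                .addSource("sourceName2", "topic2")
                .addSource("sourceName3", "topic3")
                .addProcessor("process", "sourceName1", "sourceName2", "sourceName3");
    }

    public static void main(String[] args) {
        TopologySketch one = option1();
        TopologySketch two = option2();
        System.out.println("option 1: " + one.nodeCount() + " nodes, " + one.edgeCount() + " links");
        System.out.println("option 2: " + two.nodeCount() + " nodes, " + two.edgeCount() + " links");
    }
}
```

With three topics and a single downstream processor, the model gives 2 nodes and 1 link for option 1 versus 4 nodes and 3 links for option 2. Every additional downstream step that consumes directly from the sources adds one link under option 1 and three under option 2.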
