posthumecaver

Reputation: 1853

Which version of Kafka Stream would be more efficient?

I am writing a Kafka Streams application that has to join two streams, and I would like to ask which of the following options is more efficient.

I have one Kafka topic, TopicA, containing 10 million AvroObject1 records, and another, TopicB, containing 50,000 AvroObject2 records.

Which of the following stream join configurations would be more efficient (or will there be no difference at all)?

avroObject1Stream
   .join(avroObject2Stream)

or

avroObject2Stream
  .join(avroObject1Stream)

As a follow-up question: TopicA has a retention of one day and TopicB of 10 days, and I use the following JoinWindows configuration:

avroObject1Stream
   .join(avroObject2Stream,
            JoinWindows.of(Duration.ofDays(10)).grace(Duration.ofDays(10)))

I know that the log retention for the stream join topic is the JoinWindows maintain time + 1 day (out of the box), but what does this mean given TopicA's 1-day retention? The AvroObject1 records will disappear from TopicA once they are older than 1 day, but will they still be visible in the stream join topic after that, or will Kafka's retention on TopicA make them disappear from the join topic as well?

Thx for answers...

Upvotes: 0

Views: 59

Answers (1)

Matthias J. Sax

Reputation: 62350

It's stream processing, hence the "number of objects" does not matter; streams are conceptually infinite anyway. Therefore, both programs are equivalent: for inner joins, it makes no difference which stream is the left and which is the right.
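To make the symmetry concrete, here is a minimal plain-Java sketch. It is not Kafka Streams code: `JoinOrderSketch`, `Rec`, `innerJoin` and the sample data are hypothetical names invented for illustration, and the toy join works on finite lists in a nested loop, whereas a real stream join processes records incrementally as they arrive. It only shows that an inner join matching on key and time difference yields the same pairs whichever input is on the left.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiFunction;

class JoinOrderSketch {
    /** Hypothetical stand-in for a keyed, timestamped stream record. */
    static final class Rec {
        final String key;
        final long ts;
        final String value;

        Rec(String key, long ts, String value) {
            this.key = key;
            this.ts = ts;
            this.value = value;
        }
    }

    /** Toy inner join: pairs records with equal keys whose timestamps are at most windowMs apart. */
    static Set<String> innerJoin(List<Rec> left, List<Rec> right, long windowMs,
                                 BiFunction<String, String, String> joiner) {
        Set<String> out = new TreeSet<>();
        for (Rec l : left) {
            for (Rec r : right) {
                if (l.key.equals(r.key) && Math.abs(l.ts - r.ts) <= windowMs) {
                    out.add(l.key + "=" + joiner.apply(l.value, r.value));
                }
            }
        }
        return out;
    }

    // Made-up sample data standing in for TopicA and TopicB.
    static List<Rec> sampleA() {
        return List.of(new Rec("k1", 1_000L, "a1"), new Rec("k2", 2_000L, "a2"),
                       new Rec("k1", 90_000L, "a3"));
    }

    static List<Rec> sampleB() {
        return List.of(new Rec("k1", 1_500L, "b1"), new Rec("k3", 2_500L, "b3"));
    }

    public static void main(String[] args) {
        long windowMs = 10_000L;
        // A joined with B, and B joined with A; the joiner arguments are swapped accordingly.
        Set<String> ab = innerJoin(sampleA(), sampleB(), windowMs, (a, b) -> a + "|" + b);
        Set<String> ba = innerJoin(sampleB(), sampleA(), windowMs, (b, a) -> a + "|" + b);
        System.out.println(ab.equals(ba));
    }
}
```

The only thing that changes between the two call orders is the argument order of the joiner function, which mirrors what happens when you swap the two streams in the real DSL.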

The retention time of the join's changelog topic does not affect the retention time of your input topics, and vice versa. For the join, each input record is basically copied into a local store and into an additional changelog topic. If your data is deleted from the input topic, it is not deleted from the store or the changelog topic. The store and the changelog topic delete their copies of a record after the store retention time has passed.
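As a rough illustration of the two independent retention clocks, here is a toy model in plain Java. It is not Kafka code: `RetentionSketch`, `TimedLog` and `simulate` are hypothetical names, and the 20-day store retention used in `main` is an arbitrary illustrative number, not the value Kafka Streams would compute for your JoinWindows.

```java
import java.time.Duration;
import java.util.TreeMap;

class RetentionSketch {
    static final long DAY_MS = Duration.ofDays(1).toMillis();

    /** Hypothetical log with its own retention; stands in for a topic or a join store. */
    static final class TimedLog {
        final long retentionMs;
        final TreeMap<Long, String> records = new TreeMap<>();

        TimedLog(long retentionMs) {
            this.retentionMs = retentionMs;
        }

        void append(long tsMs, String value) {
            records.put(tsMs, value);
        }

        /** Drops every record with a timestamp strictly older than nowMs - retentionMs. */
        void expire(long nowMs) {
            records.headMap(nowMs - retentionMs, false).clear();
        }

        boolean contains(String value) {
            return records.containsValue(value);
        }
    }

    /** Returns {still in input topic, still in join store} after elapsedDays. */
    static boolean[] simulate(long inputRetentionDays, long storeRetentionDays, long elapsedDays) {
        TimedLog inputTopic = new TimedLog(inputRetentionDays * DAY_MS);
        TimedLog joinStore = new TimedLog(storeRetentionDays * DAY_MS);

        // The join copies the input record into its own store at processing time.
        inputTopic.append(0L, "a1");
        joinStore.append(0L, "a1");

        // Each log applies only its own retention; neither looks at the other.
        long nowMs = elapsedDays * DAY_MS;
        inputTopic.expire(nowMs);
        joinStore.expire(nowMs);

        return new boolean[] { inputTopic.contains("a1"), joinStore.contains("a1") };
    }

    public static void main(String[] args) {
        // 1-day input retention, assumed 20-day store retention, checked after 2 days.
        boolean[] r = simulate(1, 20, 2);
        System.out.println("in input topic: " + r[0] + ", in join store: " + r[1]);
    }
}
```

In this model the copy in the store outlives the original in the input topic, and it only goes away once the store's own retention has elapsed.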

Upvotes: 1
