Reputation: 408
Let's say we have an inData topic with 5 partitions that contains contract data, with the contractId as the key. I have 3 instances of a Kafka Streams application, which counts the number of contracts.
Now I want to implement a total count of contracts in my Kafka Streams application. I have read that each stream application instance is assigned to only one partition. Does that mean each instance only has the count for its own partition?
How can a total count of contracts be implemented? Do I need an intermediate topic with only one partition? Can it be achieved using a globalTable?
Upvotes: 0
Views: 919
Reputation: 62285
Using a GlobalKTable or global state store wouldn't work (at least not directly), because both can only store unmodified data from a topic; however, you want to do some processing (i.e., counting).
If you want to count all unique contractId values, you should first load the data into a KTable (via builder.table()) and afterward do a groupBy().count(). In the groupBy(), you map all records to the same new key. Because all records are mapped to the same key, they will be repartitioned to the same topic partition, and thus you get a global count.
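To illustrate why re-keying gives a single global count, here is a minimal plain-Java simulation of the idea. It does not use the Kafka Streams API: partitionFor is a hypothetical stand-in for a key-based partitioner, the constant key "all" is an arbitrary name, and the per-partition maps only imitate the upsert behaviour of a KTable.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalCountSketch {

    // Hypothetical stand-in for a key-based partitioner: equal keys always land in the same partition.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Table view per input partition: latest value per contractId (upsert semantics, like a KTable).
    static Map<Integer, Map<String, String>> tablesByPartition(List<String[]> records, int numPartitions) {
        Map<Integer, Map<String, String>> tables = new HashMap<>();
        for (String[] record : records) {
            String contractId = record[0];
            String value = record[1];
            tables.computeIfAbsent(partitionFor(contractId, numPartitions), p -> new HashMap<>())
                  .put(contractId, value);
        }
        return tables;
    }

    // What each task can compute on its own: a partial count per input partition.
    static Map<Integer, Long> partialCounts(List<String[]> records, int numPartitions) {
        Map<Integer, Long> counts = new HashMap<>();
        tablesByPartition(records, numPartitions)
            .forEach((partition, table) -> counts.put(partition, (long) table.size()));
        return counts;
    }

    // Re-key every table row to one constant key, route by that key, then count per target partition.
    static Map<Integer, Long> countsAfterRekey(List<String[]> records, int numPartitions) {
        String constantKey = "all"; // hypothetical key name; any constant works
        Map<Integer, Long> counts = new HashMap<>();
        for (Map<String, String> table : tablesByPartition(records, numPartitions).values()) {
            for (String ignoredContractId : table.keySet()) {
                // Every row carries the same key, so every row is routed to the same partition.
                counts.merge(partitionFor(constantKey, numPartitions), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> records = List.of(
            new String[] {"c1", "v1"},
            new String[] {"c2", "v1"},
            new String[] {"c3", "v1"},
            new String[] {"c1", "v2"}); // update to an existing contract, not a new one
        System.out.println("partial counts per input partition: " + partialCounts(records, 5));
        System.out.println("counts after re-keying: " + countsAfterRekey(records, 5));
    }
}
```

After re-keying, the result map has exactly one entry, and its value is the number of distinct contractIds across all input partitions. The second record for c1 overwrites the first in the table view, which is why loading the data as a table first counts unique contracts rather than raw records.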
Upvotes: 1