Reputation: 531
I have a system where we process text messages. Each message gets split up into sentences, and each sentence gets processed individually and the results of each sentence get published to a topic. This all happens asynchronously.
I want to be able to aggregate the results for the sentences.
The problem is that I want the window to end when the total number of sentences have been reached, or when a total amount of time has passed. Basically Tumbling time windows, but can end when a total number of results have been received.
Secondarily I want to be able to know when that window ends so that I can process the aggregation as an atomic event.
Upvotes: 3
Views: 1269
Reputation: 247
Now Apache Kafka offers you a way to wait closing the window. Here piece of code;
suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
For more, check it out.
Upvotes: 0
Reputation: 4314
It's possible but you have to implement a custom processor - your requirements are simply to specific for the high-level API to cater for.
Your processor would store messages into a state store and use punctuate to periodically check if the window expired. It would also keep a running counter and check if the max number of results have been received. If either condition is met, it does the aggregation, removes messages from the state store and sends the results downstream.
You'd have to think about what to do on restart (failover/re-balancing). When starting up, the processor should inspect its state store and calculate the current running count and the window expiry time.
Upvotes: 2