I have a Kafka topic HrEvents, which contains a lot of Hire, Fire, Quit, Promotion, and Demotion messages. Each HR event message has an employee_id property (which is also the key used for partitioning) and a data property, which may contain arbitrary details about the HR event.
The problem is that the various data blobs my application needs to be able to handle are not well documented, and there is a chance that, at any moment, an HR event may be consumed that the application cannot process.
It is important that, for each employee_id, the application processes all HR events in order. It is also important that, following such a processing failure affecting one employee_id, processing of HR events for all other employee_ids can continue.
The failing HR event, and all subsequent HR events for the same employee_id, should be published to a dead letter queue. Once the application has been patched, and support for another undocumented form of data blob has been added, these HR events can be consumed from the dead letter queue.
I realize that this also requires maintaining some form of key blacklist in the consumer, storing the employee_ids for which at least one unconsumed HR event message sits in the dead letter queue. Something along the lines of the sketch below is what I have in mind.
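A rough sketch of that consumer, where the DLQ topic name HrEventsDlq and the process(...) call are placeholders of mine; the in-memory blacklist would also need to be rebuilt from the DLQ on startup, since it does not survive a restart:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class BlacklistingHrEventConsumer {

    // employee_ids with at least one unconsumed event in the DLQ
    private final Set<String> blacklist = new HashSet<>();

    public void run(KafkaConsumer<String, String> consumer,
                    KafkaProducer<String, String> dlqProducer) {
        consumer.subscribe(Collections.singletonList("HrEvents"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                String employeeId = record.key();
                if (blacklist.contains(employeeId)) {
                    // An earlier event for this employee is parked in the DLQ;
                    // divert this one too, so per-key ordering is preserved.
                    dlqProducer.send(new ProducerRecord<>("HrEventsDlq", employeeId, record.value()));
                    continue;
                }
                try {
                    process(record.value());
                } catch (Exception e) {
                    // First failure for this employee_id: blacklist the key and
                    // park the failing event itself in the DLQ.
                    blacklist.add(employeeId);
                    dlqProducer.send(new ProducerRecord<>("HrEventsDlq", employeeId, record.value()));
                }
            }
        }
    }

    private void process(String data) {
        // application-specific handling of the data blob
    }
}
```

Once the application has been patched, a separate consumer could drain HrEventsDlq in order and remove each employee_id from the blacklist once its last parked event has been handled.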
Are there existing solutions or Java libraries that would allow me to implement this?
Please forgive my ignorance, but I'm trying to find a solution for the problem described above, but I suspect I might not be searching with the correct jargon. Feel free to educate me.
Sounds like you should be able to utilize Kafka Streams for this.
Your dead letter queue can build up a KTable, which forms a type of blacklist. As new events come in on the original topic, you'd perform lookups against the KTable for existing IDs and append incoming events to the value list of events yet to be processed for that ID.
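A minimal sketch of that topology, assuming String serdes and the topic names HrEvents / HrEventsDlq (both placeholders). Note the usual KStream-KTable timing caveat: the table only reflects DLQ records the Streams app has already consumed, so the blacklist lookup is eventually consistent. This sketch also diverts each blacklisted event to the DLQ topic individually rather than aggregating a per-key list as described above; an aggregate() over the DLQ stream could build that list instead.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

import java.util.Properties;

public class HrEventRouter {

    // joined value: the original event plus the blacklist lookup result
    private static class Routed {
        final String event;
        final boolean blacklisted;
        Routed(String event, boolean blacklisted) {
            this.event = event;
            this.blacklisted = blacklisted;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "hr-event-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // The DLQ topic doubles as the blacklist: any employee_id present
        // in this KTable has at least one event waiting to be reprocessed.
        KTable<String, String> blacklist = builder.table(
                "HrEventsDlq", Consumed.with(Serdes.String(), Serdes.String()));

        KStream<String, Routed> routed = builder
                .stream("HrEvents", Consumed.with(Serdes.String(), Serdes.String()))
                .leftJoin(blacklist, (event, dlqEntry) -> new Routed(event, dlqEntry != null));

        // Blacklisted keys: divert straight to the DLQ to preserve per-key order.
        routed.filter((employeeId, r) -> r.blacklisted)
              .mapValues(r -> r.event)
              .to("HrEventsDlq");

        // Clean keys: try to process; on failure, the event itself goes to the
        // DLQ, which in turn adds the key to the blacklist KTable.
        routed.filter((employeeId, r) -> !r.blacklisted)
              .mapValues(r -> {
                  try {
                      process(r.event);
                      return null;        // handled, nothing to forward
                  } catch (Exception e) {
                      return r.event;     // first failing event for this key
                  }
              })
              .filter((employeeId, event) -> event != null)
              .to("HrEventsDlq");

        new KafkaStreams(builder.build(), props).start();
    }

    private static void process(String data) { /* application logic */ }
}
```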