Reputation: 3701
So, as far as I understand from Transactions in Apache Kafka, a read_committed consumer will not return messages that are part of an ongoing transaction. So, I guess, the consumer either has the option to commit its offset past those ongoing-transaction messages (e.g. to read non-transactional messages), or the option to make no further progress until the encountered transaction is committed/aborted. I suppose Kafka will allow it to skip those pending-transaction records, but then how will the consumer read them once they are committed, considering that its offset might already be far past them?
UPDATE
Consider that the topic could have a mix of records (aka messages) coming from non-transactional producers and transactional ones. E.g., consider this partition from a topic:
non-transact-Xmsg, from-transact-producer1-msg, from-transact-producer2-msg, non-transact-Ymsg
If the consumer encounters from-transact-producer1-msg, will it skip that message and then read non-transact-Ymsg, or will it just hang before the not-yet-committed from-transact-producer1-msg and thereby not read non-transact-Ymsg either?
Consider also that there could be many transactional producers, and so many equivalents of from-transact-producer1-msg, some committed and some not. For example, from-transact-producer2-msg could already be committed by the time the consumer reaches non-transact-Xmsg.
Upvotes: 2
Views: 2701
Reputation: 4365
From the docs about isolation.level:
"Messages will always be returned in offset order. Hence, in read_committed mode, consumer.poll() will only return messages up to the last stable offset (LSO), which is the one less than the offset of the first open transaction. In particular any messages appearing after messages belonging to ongoing transactions will be withheld until the relevant transaction has been completed. As a result, read_committed consumers will not be able to read up to the high watermark when there are in flight transactions."
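Applied to the partition from your UPDATE, this means nothing is skipped: poll() simply stops before the first record of an open transaction, and everything after it (committed or not, transactional or not) is withheld. Here is a toy model of that rule; Rec, visibleToReadCommitted, and the open/committed flags are made-up illustration names, not the actual client implementation:

```java
import java.util.List;

public class LsoSketch {
    // Minimal model of a record: its offset, value, and whether it belongs
    // to a transaction that is still open (neither committed nor aborted).
    record Rec(long offset, String value, boolean inOpenTransaction) {}

    // read_committed poll() returns records strictly before the first record
    // of an open transaction; later records are withheld, not skipped.
    static List<Rec> visibleToReadCommitted(List<Rec> partition) {
        long firstOpen = partition.stream()
                .filter(Rec::inOpenTransaction)
                .mapToLong(Rec::offset)
                .min()
                .orElse(Long.MAX_VALUE); // no open transaction: everything is visible
        return partition.stream().filter(r -> r.offset() < firstOpen).toList();
    }

    public static void main(String[] args) {
        // The partition from the question: producer1's transaction is still
        // open, producer2's transaction is already committed.
        List<Rec> partition = List.of(
                new Rec(0, "non-transact-Xmsg", false),
                new Rec(1, "from-transact-producer1-msg", true),
                new Rec(2, "from-transact-producer2-msg", false),
                new Rec(3, "non-transact-Ymsg", false));
        visibleToReadCommitted(partition)
                .forEach(r -> System.out.println(r.value()));
        // Prints only non-transact-Xmsg: the committed producer2 record and
        // non-transact-Ymsg stay withheld until producer1's transaction ends,
        // at which point the LSO advances and poll() returns them in order.
    }
}
```

Once the open transaction completes, the LSO moves forward and the consumer reads the withheld records in normal offset order, so its committed offset never jumps past them.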
Upvotes: 2
Reputation: 1480
Your requirement is not 100% clear, but if I get it right, you want to be able to re-process some consumed messages that, for some reason, you could not successfully process when consuming them the first time. And you don't want to get "stuck" on those messages; you prefer to move on and handle them later. In that case, the best option would probably be to write them to a different queue, and have another consumer read those "failed" messages and retry as much as you'd like.
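That pattern can be sketched as follows; the retry-topic name "failed-records", the process step, and the helper names are all made up for illustration (in a real consumer loop, the "send to retry topic" branch would be a producer.send to that topic before committing the offset):

```java
public class RetryRouting {
    // Hypothetical outcome of handling one consumed record.
    enum Outcome { PROCESSED, FAILED }

    // Made-up processing step: pretend records containing "bad" fail.
    static Outcome process(String message) {
        return message.contains("bad") ? Outcome.FAILED : Outcome.PROCESSED;
    }

    // The pattern from the answer: on failure, route the record to a separate
    // retry topic instead of blocking, then commit the offset and move on,
    // so the main consumer never gets stuck on one bad record.
    static String nextStep(String message) {
        return process(message) == Outcome.FAILED
                ? "send to retry topic 'failed-records', then commit offset"
                : "commit offset";
    }

    public static void main(String[] args) {
        System.out.println(nextStep("good-record"));
        System.out.println(nextStep("bad-record"));
    }
}
```

A second consumer subscribed to the retry topic can then re-attempt those records on its own schedule, with whatever retry limit or backoff you like.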
Upvotes: 0