Reputation: 8354
I know that it is possible to consume an SQS queue using multiple threads. I would like to guarantee that each message is consumed exactly once. I know that it is possible to change the visibility timeout of a message, e.g., to match my processing time. If my process takes longer than the visibility timeout (e.g. because of a slow connection), another thread can consume the same message.
What is the best approach to guarantee that a message will be processed once?
Upvotes: 17
Views: 14958
Reputation: 16215
What is the best approach to guarantee that a message will be processed once?
You're asking for a guarantee - you won't get one. You can reduce the probability of a message being processed more than once to a very small amount, but you won't get a guarantee.
I'll explain why, along with strategies for reducing duplication.
For the most part, by taking advantage of SQS message visibility timeout, the chances of duplication from these sources are already pretty small - like fraction of a percent small.
If processing duplicates really isn't that bad (strive to make your message consumption idempotent!), I'd consider this good enough - reducing chances of duplication further is complicated and potentially expensive...
Ok, here we go down the rabbit hole... at a high level, you will want to assign unique ids to your messages, and check against an atomic cache of ids that are in progress or completed before starting processing:
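A minimal sketch of such a check, assuming all consumer threads live in a single JVM (consumers spread over several hosts would need a shared store offering the same atomic put-if-absent behaviour). The class `MessageDeduplicator` and its method names are hypothetical and not part of any SQS SDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: tracks message ids that are in progress or completed.
class MessageDeduplicator {
    enum State { IN_PROGRESS, COMPLETED }

    private final ConcurrentMap<String, State> states = new ConcurrentHashMap<>();

    // Atomically claims the id; returns true only for the first caller.
    boolean tryClaim(String messageId) {
        return states.putIfAbsent(messageId, State.IN_PROGRESS) == null;
    }

    // Marks a claimed id as completed so later duplicates are still rejected.
    void markCompleted(String messageId) {
        states.put(messageId, State.COMPLETED);
    }

    // Releases an in-progress claim after a failure so a redelivery can be processed.
    void release(String messageId) {
        states.remove(messageId, State.IN_PROGRESS);
    }
}
```

A consumer thread would call `tryClaim` before processing, skip the message when it returns false, call `markCompleted` once processing succeeds (and then delete the message from the queue), and call `release` if processing fails. Entries never expire in this sketch, so a real cache would also need an eviction policy.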
Upvotes: 42
Reputation: 477
You can use setVisibilityTimeout() for both messages and batches, in order to extend the visibility time until the thread has completed processing the message.
This can be done with a ScheduledExecutorService that schedules a Runnable to fire after half the initial visibility time. The code snippet below runs the VisibilityTimeExtender for the first time after visibilityTime/2 and then repeats it with a period of visibilityTime/2. (Each run should extend the timeout by enough to cover the interval until the next run, i.e. by at least visibilityTime/2.)
private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
ScheduledFuture<?> futureEvent = scheduler.scheduleAtFixedRate(new VisibilityTimeExtender(..), visibilityTime/2, visibilityTime/2, TimeUnit.SECONDS);
VisibilityTimeExtender must implement Runnable, and is where you update the new visibility time.
When the thread is done processing the message, you can delete it from the queue, and call futureEvent.cancel(true) to stop the scheduled event.
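The pieces described above could fit together roughly as follows. This is only a sketch: the `VisibilityChanger` interface is a hypothetical stand-in for the SDK's change-message-visibility call, and the constructor arguments of `VisibilityTimeExtender` as well as the `ExtendedProcessing` helper are assumptions, since the original snippet elides them.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the SQS client call that changes a message's visibility timeout.
interface VisibilityChanger {
    void changeVisibility(String receiptHandle, int timeoutSeconds);
}

// Runnable that resets the visibility timeout of one message each time it runs.
class VisibilityTimeExtender implements Runnable {
    private final VisibilityChanger client;
    private final String receiptHandle;
    private final int visibilityTime;

    // Assumed constructor arguments; the original answer does not show them.
    VisibilityTimeExtender(VisibilityChanger client, String receiptHandle, int visibilityTime) {
        this.client = client;
        this.receiptHandle = receiptHandle;
        this.visibilityTime = visibilityTime;
    }

    @Override
    public void run() {
        // Restart the timeout at its full length, counted from now.
        client.changeVisibility(receiptHandle, visibilityTime);
    }
}

// Hypothetical helper that keeps a message invisible while the given work runs.
class ExtendedProcessing {
    static void process(ScheduledExecutorService scheduler, VisibilityChanger client,
                        String receiptHandle, int visibilityTime, Runnable work) {
        // scheduleAtFixedRate rejects a period of 0, so use at least one second.
        long half = Math.max(1, visibilityTime / 2);
        ScheduledFuture<?> futureEvent = scheduler.scheduleAtFixedRate(
                new VisibilityTimeExtender(client, receiptHandle, visibilityTime),
                half, half, TimeUnit.SECONDS);
        try {
            work.run();
        } finally {
            // Stop extending whether the work succeeded or threw.
            futureEvent.cancel(true);
        }
    }
}
```

Deleting the message from the queue would happen inside `work` (or right after `process` returns); if the work throws, the extender stops and the message becomes visible again once the last extension expires.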
Upvotes: 1
Reputation: 13166
The AWS SQS API doesn't automatically "consume" a message when you read it through the API. The developer needs to make the call to delete the message themselves.
SQS does have a feature called "redrive policy" as part of the "Dead Letter Queue Settings". You just set the maximum receive count (maxReceiveCount) to 1. If the consuming process crashes, the next read of the same message will move it to the dead-letter queue.
The SQS visibility timeout can be set to at most 12 hours. If you have a special need that goes beyond that, you will have to implement a process that stores the message handler in a database so that it can be inspected.
Upvotes: 1
Reputation: 178956
Store the message, or a reference to the message, in a database with a unique constraint on the Message ID, when you receive it. If the ID exists in the table, you've already received it, and the database will not allow you to insert it again -- because of the unique constraint.
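A rough JDBC sketch of this idea follows. The table `processed_messages`, its column `message_id`, and the class `ProcessedMessageStore` are hypothetical names chosen for illustration. Drivers differ in how they report a violated constraint, so the sketch treats both `SQLIntegrityConstraintViolationException` and any SQLState in class 23 (integrity constraint violation) as "already received".

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Assumed schema (hypothetical):
//   CREATE TABLE processed_messages (message_id VARCHAR(128) PRIMARY KEY)
class ProcessedMessageStore {
    private static final String INSERT_SQL =
            "INSERT INTO processed_messages (message_id) VALUES (?)";

    // Returns true if the id was newly recorded, false if the unique constraint rejected it.
    static boolean recordIfNew(Connection connection, String messageId) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            statement.setString(1, messageId);
            statement.executeUpdate();
            return true;
        } catch (SQLException e) {
            String state = e.getSQLState();
            boolean duplicate = e instanceof SQLIntegrityConstraintViolationException
                    || (state != null && state.startsWith("23"));
            if (duplicate) {
                return false; // the id is already in the table
            }
            throw e; // any other database error is not a duplicate
        }
    }
}
```

A consumer would call `recordIfNew` right after receiving a message and only process it when the method returns true.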
Upvotes: 2