Reputation: 486
I have a small project where I publish a number of messages (~1000) and then process them on a single thread, yet I still receive duplicates.
Is this expected behavior for Pub/Sub?
This is the code that creates the subscriber:
ExecutorProvider executorProvider =
    InstantiatingExecutorProvider.newBuilder().setExecutorThreadCount(1).build();

// create subscriber
subscriber =
    Subscriber.newBuilder(subscriptionName, messageReceiver)
        .setExecutorProvider(executorProvider)
        .build();
subscriber.startAsync();
Here is the demo: https://github.com/andonescu/play-pubsub
I published 1000 messages; processing each one took 300 milliseconds (a delay added intentionally), after which ack() was called. The ack deadline on the subscription is 10 seconds. Given all this, I should not receive duplicate messages, yet more than 10% of the messages sent were redelivered.
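For scale, a rough back-of-the-envelope (assuming the messages really are processed strictly one at a time on the single executor thread):

1000 messages × 0.3 s/message ≈ 300 s of total serial processing time,

compared with a 10-second ack deadline per message. The client library does extend deadlines automatically for messages it holds, but any message it cannot keep extended in time may be redelivered.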
here is the log: https://github.com/andonescu/play-pubsub/blob/master/reports/1000-messages-reader-status
I've also filed the same question at https://github.com/GoogleCloudPlatform/pubsub/issues/182
Upvotes: 1
Views: 995
Reputation: 486
Looking carefully through the Pub/Sub documentation, I discovered the following part:
However, messages may sometimes be delivered out of order or more than once. In general, accommodating more-than-once delivery requires your subscriber to be idempotent when processing messages. You can achieve exactly once processing of Cloud Pub/Sub message streams using Cloud Dataflow PubsubIO. PubsubIO de-duplicates messages on custom message identifiers or those assigned by Cloud Pub/Sub.
https://cloud.google.com/pubsub/docs/subscriber#at-least-once-delivery
It seems that Cloud Dataflow PubsubIO is the key in my case.
Alternatively, attach a unique ID to each message and do the de-duplication in the client :)
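A minimal sketch of that client-side approach, in plain Java (the `Deduplicator` class and `firstDelivery` method are names I made up; with Pub/Sub the ID could come from `message.getMessageId()` or a custom attribute the publisher assigns):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of client-side de-duplication: remember every message ID we
// have already accepted, and skip (but still ack) redeliveries.
class Deduplicator {
    // Thread-safe set, so this also works with more executor threads.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // Returns true only the first time a given ID is seen;
    // false for every subsequent (duplicate) delivery.
    boolean firstDelivery(String messageId) {
        return seen.add(messageId);
    }
}
```

In the receive callback one would call `firstDelivery(...)` with the message's ID, process only when it returns true, and ack in both cases. Note the set grows without bound; a real implementation would evict entries once redelivery is no longer possible.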
Upvotes: 1