Reputation: 373
We are using the google-cloud-pubsub (0.24.0-beta) pull client to read messages from a subscription and are seeing a high rate of duplicates. Google's documentation says that a small amount of duplication is expected, but in our case about 80% of messages are being delivered more than once, even after they have been acknowledged.
The strangest part is that duplicates still occur even if we acknowledge the message immediately in the receiver using consumer.ack(). Does anybody know how to handle this?
Upvotes: 1
Views: 1151
Reputation: 17261
A large number of duplicate messages could be the result of flow control settings that are either too high or too low. If your flow control settings are too high, so that too many messages are allowed to be outstanding to your client at the same time, then the acks may be sent too late, after the ack deadline has passed, and the messages get redelivered. If this is the cause, you would probably see the CPU of your machine at or near 100%. In this case, try setting the max number of outstanding messages or bytes to a lower number.
It could also be the case that the flow control settings are set too low. Some messages get buffered in the client before they are delivered to your MessageReceiver, particularly if you are flow controlled. In this case, messages may spend too much time buffered in the client before they are delivered. There is an issue with messages in this state that is being fixed in an outstanding PR. In this scenario, you could either increase your max outstanding bytes or messages (up to whatever your subscriber can actually handle) or try calling setAckExpirationPadding with a value larger than the default 500ms.
It is also worth checking your publisher to see if it is unexpectedly publishing messages multiple times. If that is the case, you would see messages whose contents are the same, but they are separately published messages, not duplicates generated by Google Cloud Pub/Sub itself.
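One way to tell the two cases apart is to log the message ID alongside the payload. A redelivery by Pub/Sub carries the same message ID as the original delivery, whereas a message published twice gets a new ID each time. The following is only a rough sketch in plain Java with no Pub/Sub dependency: the class DuplicateClassifier and its Kind enum are hypothetical names, and the ID and payload strings stand in for what you would read from PubsubMessage.getMessageId() and getData() inside your receiver. It also keeps everything in unbounded in-memory sets, which is fine for a short diagnostic run but not for production.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical diagnostic helper, not part of the google-cloud-pubsub library.
class DuplicateClassifier {
    enum Kind { FIRST_DELIVERY, REDELIVERY, REPUBLISHED }

    // Message IDs and payloads observed so far (unbounded: diagnostic use only).
    private final Set<String> seenIds = new HashSet<>();
    private final Set<String> seenPayloads = new HashSet<>();

    // messageId and payload are assumed to come from the received message.
    Kind classify(String messageId, String payload) {
        boolean newId = seenIds.add(messageId);        // false if the ID was seen before
        boolean newPayload = seenPayloads.add(payload); // false if the payload was seen before
        if (!newId) {
            return Kind.REDELIVERY;   // same ID again: delivered more than once
        }
        if (!newPayload) {
            return Kind.REPUBLISHED;  // new ID, same contents: likely published twice
        }
        return Kind.FIRST_DELIVERY;
    }

    public static void main(String[] args) {
        DuplicateClassifier c = new DuplicateClassifier();
        System.out.println(c.classify("id-1", "order=42")); // FIRST_DELIVERY
        System.out.println(c.classify("id-1", "order=42")); // REDELIVERY
        System.out.println(c.classify("id-2", "order=42")); // REPUBLISHED
    }
}
```

If most of your duplicates come out as REDELIVERY, the flow control and ack timing suggestions above are the place to look; if they come out as REPUBLISHED, the publisher is the more likely culprit.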
Edited to mention bug that was in the client library:
If you were using a version of google-cloud-pubsub between v0.22.0 and v0.29.0, you might have been running into a client library bug in which a change to the underlying mechanism for getting messages could result in excessive duplicates. The issue has since been fixed.
Upvotes: 2