Lars Kemmann

Reputation: 5674

Why does my Event Hub exponential retry policy not continue forever?

I expect the following code to retry forever, but it consistently throws an exception after ~40 seconds of being disconnected from the network -- which is the best I can do to simulate transient Internet connection or Event Hub outages.

var eventHubClient = EventHubClient.CreateFromConnectionString(…);
eventHubClient.RetryPolicy = new RetryExponential(
    TimeSpan.FromSeconds(5), // minBackoff
    TimeSpan.FromMinutes(2), // maxBackoff
    Int32.MaxValue); // maxRetryCount
…
eventHubClient.SendAsync(…).Wait();

What's going on here?

I'm using the WindowsAzure.ServiceBus NuGet package, version 4.1.10 on .NET Framework 4.6. I'm willing to change either package or framework if needed.

Upvotes: 1

Views: 2149

Answers (1)

Lars Kemmann

Reputation: 5674

This answer to a related question put me on the right track. It turns out that, buried inside the class hierarchy for EventHubClient, there's a reference to the Service Bus package's MessagingFactory, which has its own per-operation timeout that is not exposed through the EventHubClient type. The retries only continue until that timeout elapses, regardless of the maxRetryCount on the retry policy.

So the way to set up a client with a very long timeout would be something like this:

var builder = new ServiceBusConnectionStringBuilder(connectionString)
{
    OperationTimeout = TimeSpan.FromDays(30), // TimeSpan.MaxValue is not allowed
    TransportType = TransportType.Amqp // This needs to be set; default is NetMessaging
};
var messagingFactory = MessagingFactory.CreateFromConnectionString(builder.ToString());
var eventHubClient = messagingFactory.CreateEventHubClient(entityPath);
eventHubClient.RetryPolicy = new RetryExponential(
    TimeSpan.FromSeconds(5),
    TimeSpan.FromMinutes(2),
    Int32.MaxValue);

That said, after verifying that this works, I realized that in my scenario it's important to send telemetry during the retries. So I went back to the default retry policy on EventHubClient and wrapped the SendAsync(…) call in a separate retry policy from the Polly library, which lets me send telemetry before each retry.
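For illustration, the Polly wrapper could look roughly like the sketch below. This is not my exact code: it assumes Polly 7.x, the names TelemetryRetry, SendWithRetryAsync, send and reportRetry are made up for the example, and it retries on every exception type, which real code would probably want to narrow down. The backoff mirrors the 5 second / 2 minute values from the RetryExponential settings above.

```csharp
using System;
using System.Threading.Tasks;
using Polly;

public static class TelemetryRetry
{
    // Hypothetical helper, not part of Polly or the Service Bus package.
    // "send" stands in for () => eventHubClient.SendAsync(…);
    // "reportRetry" stands in for whatever telemetry call the application uses.
    public static Task SendWithRetryAsync(
        Func<Task> send,
        Action<Exception, TimeSpan> reportRetry)
    {
        var policy = Policy
            .Handle<Exception>() // assumption: retry on any failure; narrow this in real code
            .WaitAndRetryForeverAsync(
                // attempt is 1-based: 5s, 10s, 20s, ... capped at 2 minutes
                attempt => TimeSpan.FromSeconds(Math.Min(120, 5 * Math.Pow(2, attempt - 1))),
                // called after each failed attempt, before waiting for the next one
                reportRetry);

        return policy.ExecuteAsync(send);
    }
}
```

It would be called along the lines of TelemetryRetry.SendWithRetryAsync(() => eventHubClient.SendAsync(…), (ex, delay) => …), with the second lambda doing the telemetry. Because Polly owns the outer loop, the client itself can keep its default retry policy and operation timeout.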

Upvotes: 2
