Otávio Décio

Reputation: 74290

How far should I go to handle network problems in code?

I have clients with really bad networks, including bad mappings at the gateways and issues with aliasing. Sometimes they go days without a hitch, other days our services fail because they can't connect to the database or the connections get mysteriously dropped.

How far should a program (namely a service) go to recover or retry? Is it reasonable to ask their network folks to get it working properly, or should I take it upon myself to survive its flakiness?

Upvotes: 3

Views: 286

Answers (1)

Joe

Reputation: 830

1) Yes, it's reasonable to expect their network to work ... you wouldn't tell someone that the car they bought is broken because they don't have any roads to drive it on, would you?

2) That said: program defensively. When you build a car, you can't expect everything to be a perfectly smooth interstate highway.

More specifically, I like to build retry mechanisms into my systems: I'll wrap an operation in 'retryable' logic, which lets you specify the number of retries. Typically, the retry delay will have quadratic backoff: say, the k-th retry waits k*k seconds, for k = 1..n where n is the number of retries. Or use fib(k), so you have something like 1, 1, 2, 3, 5 second delays. The backoff helps avoid putting unnecessary strain on the upstream resource.

If the operation still fails after a set number of retries, you can either throw an exception (which can be caught to inform a user or other modules of the error) or just log the failure, depending on the severity.
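As a rough illustration of the 'retryable' idea above, here is a minimal Python sketch. Everything named in it (`retryable`, `quadratic_delays`, `fibonacci_delays`, `RetriesExhausted`, and the choice of `ConnectionError`/`TimeoutError` as the errors worth retrying) is hypothetical and chosen for the example, not taken from any particular library. It assumes the wrapped operation is safe to repeat.

```python
import time
from typing import Callable, Iterator, Tuple, Type, TypeVar

T = TypeVar("T")


class RetriesExhausted(Exception):
    """Raised when the operation still fails after the last retry (hypothetical name)."""


def quadratic_delays(retries: int) -> Iterator[int]:
    """Yield k*k seconds for k = 1..retries: 1, 4, 9, ..."""
    for k in range(1, retries + 1):
        yield k * k


def fibonacci_delays(retries: int) -> Iterator[int]:
    """Yield the first `retries` Fibonacci numbers: 1, 1, 2, 3, 5, ..."""
    a, b = 1, 1
    for _ in range(retries):
        yield a
        a, b = b, a + b


def retryable(
    operation: Callable[[], T],
    retries: int = 5,
    delays: Callable[[int], Iterator[int]] = quadratic_delays,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`; on a retryable error, wait and try again.

    One initial attempt is made, followed by at most `retries` retries.
    Errors not listed in `retry_on` propagate immediately.
    """
    pending = delays(retries)
    while True:
        try:
            return operation()
        except retry_on as exc:
            delay = next(pending, None)
            if delay is None:
                # Out of retries: surface the failure to the caller, keeping
                # the original error attached as the cause.
                raise RetriesExhausted(
                    f"operation failed after {retries} retries"
                ) from exc
            sleep(delay)
```

A caller would use it as something like `retryable(connect_to_database, retries=5)`, where `connect_to_database` stands in for whatever call is flaky, and then either catch `RetriesExhausted` to inform the user or log it, depending on severity. Passing `sleep` as a parameter is only there so the waiting can be stubbed out in tests.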

Upvotes: 2
