Pythonic error handling of complex functions

Question

I'd like to know if there is a Pythonic way of handling errors in long-running functions, where an error in one part does not affect the function's ability to continue with the rest.

As an example, consider a function that, given a list of URLs, recursively retrieves each resource and all linked resources under the path of the top-level URLs. It stores the retrieved resources in a local filesystem with a directory structure mirroring the URL structure. Essentially, this is a basic recursive wget for a list of pages.

There are quite a number of points where this function could fail:

A URL may be invalid, or unresolvable
The host may not be reachable (perhaps temporarily)
Saving locally may have disk errors
Anything else you can think of

A failure on retrieving or saving any one resource only affects the function's ability to continue to process that resource and any child resources that may be linked from it, but it is possible to continue to retrieve other resources.

A simple model of error handling is that on the first error, an appropriate exception is raised for the caller to handle. The problem with this is that it terminates the function and does not allow it to continue. The error could possibly be fixed and the function restarted from the beginning, but this would cause work to be redone, and any permanent error may mean the function never completes.

A couple of alternatives I have in mind are:

Record errors in a list as they occur and abort processing of that resource and any child resources, but continue on to the next resource. A threshold could be used to abort the entire function if too many errors occur, or the function could simply try everything. The caller can inspect this list when the function completes to see if there were any problems.
The caller could provide a callable object that is called with each error. This moves responsibility for recording errors back to the caller. You could even specify that if the callable returns False that processing should stop. This would move the threshold management to the caller.
Implement the former with the latter, providing an error-handling object that encodes the former's behavior.
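A minimal sketch of how the three alternatives above could fit together. All names here (process_all, on_error, ErrorThreshold) are hypothetical and only for illustration: errors are recorded in a list, an optional callable is told about each one, and a return value of False from that callable stops processing.

```python
def process_all(items, work, on_error=None):
    """Apply work() to each item and keep going after failures.

    on_error(item, exc), if given, is called for every failure.
    If it returns False, processing stops early.
    Returns (results, errors), where errors is a list of (item, exc) pairs.
    """
    results, errors = [], []
    for item in items:
        try:
            results.append(work(item))
        except Exception as exc:
            errors.append((item, exc))
            if on_error is not None and on_error(item, exc) is False:
                break
    return results, errors


class ErrorThreshold:
    """Callable handler that asks for a stop once `limit` errors were seen."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def __call__(self, item, exc):
        self.count += 1
        return self.count < self.limit


# int() stands in for "retrieve and save one resource".
results, errors = process_all(["1", "x", "2", "y", "3"], int,
                              on_error=ErrorThreshold(2))
```

Here the second failure reaches the threshold, so "3" is never attempted; without a handler, every item would be tried and the caller would just inspect the returned error list.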

In Python discussions, I've often noted certain approaches described as Pythonic or non-Pythonic. I'd like to know if there are any particularly Pythonic approaches to handling the type of scenario described above.

Does Python have any batteries included that model more sophisticated error handling than the terminate model of exception handling, or do the more complex batteries included use a model of error handling that I should copy to stay Pythonic?

Note: Please do not focus on the example. I'm not looking to solve problems in that particular space, but it seemed like a good example that most people here would have an understanding of.

ncoghlan · Accepted Answer

I don't think there's a particularly clear "Pythonic/non-Pythonic" distinction at the level you're talking about here.

One of the big reasons there's no "one-size-fits-all" solution in this domain is that the exact semantics you want are going to be problem-specific.

For one situation, abort-on-first-failure may be adequate.
For another, you may want abort-and-rollback if any of the operations fails.
For a third, you may want to complete as many as possible and simply log-and-ignore failures.
For a fourth alternative, you may want to complete as many as possible, but raise an exception at the end to report any that failed.
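As an illustration of the fourth case only, here is a rough sketch that tries every item and raises a single exception at the end. BatchError and run_all are made-up names, not anything from the standard library.

```python
class BatchError(Exception):
    """Raised after the whole batch has run if any item failed."""

    def __init__(self, results, failures):
        total = len(results) + len(failures)
        super().__init__(f"{len(failures)} of {total} operations failed")
        self.results = results    # values from the items that succeeded
        self.failures = failures  # list of (item, exception) pairs


def run_all(items, work):
    """Try every item, then report all failures in one exception."""
    results, failures = [], []
    for item in items:
        try:
            results.append(work(item))
        except Exception as exc:
            failures.append((item, exc))
    if failures:
        raise BatchError(results, failures)
    return results
```

The exception carries both the partial results and the individual failures, so the caller loses nothing by catching it. On Python 3.11 and later, the built-in ExceptionGroup is another way to carry several exceptions in one raise.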

Even supporting an error handler doesn't necessarily cover all of those desired behaviours - a simple per-failure error handler can't easily provide abort-and-rollback semantics, or generate a single exception at the end. (It's not impossible - you just have to mess around with tricks like passing bound methods or closures as your error handlers)

So the best you can do is take an educated guess at typical usage scenarios and desirable behaviours in the face of errors, and design your API accordingly.

A fully general solution would accept an on-error handler that is given each failure as it happens, and a final "errors occurred" handler that gives the caller a chance to decide how multiple errors are handled (with some protocol to allow data to be passed from the individual error handlers to the final batch error handler).
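To make that shape concrete, here is one possible sketch. The function and parameter names (run_batch, on_error, on_finish) are invented for this example, and the "protocol" is simply that whatever on_error returns is collected and handed to on_finish.

```python
def run_batch(items, work, on_error, on_finish):
    """Run work() on every item.

    on_error(item, exc) is called per failure and returns a note.
    on_finish(notes) is called once at the end if anything failed,
    and decides what the overall outcome should be.
    """
    results, notes = [], []
    for item in items:
        try:
            results.append(work(item))
        except Exception as exc:
            notes.append(on_error(item, exc))
    if notes:
        on_finish(notes)
    return results


def summarize(item, exc):
    # Per-failure handler: turn each failure into a short description.
    return f"{item!r}: {type(exc).__name__}"


def fail_if_any(notes):
    # Final handler: turn the collected notes into one exception.
    raise RuntimeError("; ".join(notes))
```

Even in this small form, the caller has to supply two cooperating callables to get anything useful, which hints at why the fully general version is rarely the right API.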

However, providing such a general solution is likely to be an API design failure. The designer of the API shouldn't be afraid to have an opinion on how their API should be used, and how errors should be handled. The main thing to keep in mind is to not overengineer your solution:

if the naive approach is adequate, don't mess with it
if collecting failures in a list and reporting a single error is good enough, do that
if you need to rollback everything if one part fails, then just implement it that way
if there's a genuine use case for custom error handling, then accept an error handler as a part of the API. But have a specific use case in mind when you do this, don't just do it for the sake of it. And when you do, have a sensible default handler that is used if the user doesn't specify one (this may just be the naive "raise immediately" approach)
If you do offer selectable error handlers, consider offering some standard error handlers that can be passed in either as callables or as named strings (i.e. along the lines of the error handler selection for text codecs)
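A small sketch of the last two points, assuming a hypothetical process function: the default is the naive "raise immediately" behaviour, and handlers can be chosen by name or passed as a callable. The names "strict" and "ignore" are borrowed from the codec error handlers; "log" and everything else here is made up for the example.

```python
import logging


def _strict(item, exc):
    # Default: behave like the naive approach and raise immediately.
    raise exc


def _ignore(item, exc):
    # Silently skip the failed item.
    pass


def _log(item, exc):
    logging.getLogger(__name__).warning("failed on %r: %s", item, exc)


ERROR_HANDLERS = {"strict": _strict, "ignore": _ignore, "log": _log}


def process(items, work, errors="strict"):
    """Apply work() to each item.

    errors is either the name of a standard handler or any callable
    taking (item, exc).
    """
    if isinstance(errors, str):
        handler = ERROR_HANDLERS[errors]  # unknown names raise KeyError
    else:
        handler = errors
    results = []
    for item in items:
        try:
            results.append(work(item))
        except Exception as exc:
            handler(item, exc)
    return results
```

With this layout, process(items, work) behaves like ordinary exception handling, process(items, work, errors="ignore") skips failures, and a caller with unusual needs can still pass their own function.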

Perhaps the best you're going to get as a general principle is that "Pythonic" error handling will be as simple as possible, but no simpler. But at that point, the word is just being used as a synonym for "good code", which isn't really its intent.

On the other hand, it is slightly easier to talk about what actual forms non-Pythonic error handling might take:

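def myFunction(an_arg, error_handler):
    try:
        do_stuff(an_arg)  # hypothetical placeholder for the real work
    except (RuntimeError, IOError) as err:
        # Non-Pythonic: the API dispatches on the error type for the caller
        if isinstance(err, RuntimeError):
            error_handler.handleRuntimeError()
        elif isinstance(err, IOError):
            error_handler.handleIOError()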

The Pythonic idiom is that error handlers, if supported at all, are just simple callables. Give them the information they need to decide how to handle the situation, rather than try to decide too much on their behalf. If you want to make it easier to implement common aspects of the error handling, then provide a separate helper class with a __call__ method that does the dispatch, so people can decide whether or not they want to use it (or how much they want to override when they do use it). This isn't completely Python-specific, but it is something that folks coming from languages that make it annoyingly difficult to pass arbitrary callables around (such as Java, C, C++) may get wrong. So complex error handling protocols would definitely be a way to head into "non-Pythonic error handling" territory.

The other problem in the above non-Pythonic code is that there's no default handler provided. Forcing every API user to make a decision they may not yet be equipped to make is just poor API design. But now we're back in general "good code"/"bad code" territory, so Pythonic/non-Pythonic really shouldn't be used to describe the difference.
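For contrast, a sketch of the shape described above: the handler is a plain callable that receives the exception, there is a default (re-raise) when none is given, and type-based dispatch lives in an optional helper class. The names my_function, DispatchingHandler and the on_* hooks are illustrative assumptions, and the zero-argument callable stands in for the real work.

```python
class DispatchingHandler:
    """Optional helper: routes an exception to a hook method by type.

    Subclasses override only the hooks they care about; by default
    every hook re-raises.
    """

    def __call__(self, exc):
        if isinstance(exc, OSError):  # IOError is an alias of OSError in Python 3
            return self.on_os_error(exc)
        if isinstance(exc, RuntimeError):
            return self.on_runtime_error(exc)
        return self.on_other(exc)

    def on_os_error(self, exc):
        raise exc

    def on_runtime_error(self, exc):
        raise exc

    def on_other(self, exc):
        raise exc


def my_function(task, error_handler=None):
    """Run task(); on failure, defer to error_handler(exc) if one was given."""
    try:
        return task()
    except Exception as exc:
        if error_handler is None:
            raise  # sensible default: the naive behaviour
        return error_handler(exc)


class QuietIO(DispatchingHandler):
    def on_os_error(self, exc):
        return None  # swallow I/O problems, still raise everything else


def boom():
    raise OSError("disk full")
```

Callers who want the helper pass QuietIO(); callers who don't can pass any function or lambda taking the exception, and callers who pass nothing get ordinary exception behaviour.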
