pteehan

Reputation: 817

plyr parallel error-handling and warnings

This is a common construction I use for error-handling:

x <- tryCatch(foo(), error=function(e){
    warning(e)
    NULL})

I run foo against lots of data objects, some of which might fail for whatever reason. If one does, I want the result to be NULL so that my entire run doesn't stop, but I also want a warning so I can see what failed and why.

I often run these from plyr, like this, and let's suppose some of them fail:

x <- llply(1:4, .fun=function(i) {  
    result<-tryCatch({
        if(i %% 2==0) stop(i)
        i}, error=function(e) {
           warning(e)
           NULL})
    result})
 x

Result:

 Warning messages:
 1: In doTryCatch(return(expr), name, parentenv, handler) : 2
 2: In doTryCatch(return(expr), name, parentenv, handler) : 4

 > x
 [[1]]
 [1] 1

 [[2]]
 NULL

 [[3]]
 [1] 3

 [[4]]
 NULL

However, suppose I turn on parallel computing with the same code:

 require(doParallel)
 registerDoParallel(cores=4)
 x <- llply(1:4, .parallel=TRUE, .fun=function(i) {  
      result<-tryCatch({
          if(i %% 2==0) stop(i)
          i}, error=function(e) {
              warning(e)
              NULL})
      result})

  Result: 
  Error in do.ply(i) : task 2 failed - "2"

The job fails on an error in any of the tasks, and no result is constructed. warning(e) was somehow converted to an error. I can get around this by commenting out warning(e), which gives the desired NULLs in my data structure where there was an error, but then I lose the information about what happened.

In fact, I don't know of any good way to throw warnings from parallel plyr; they seem to be squelched. If that's a limitation that follows from parallelism, that makes sense. But the warnings-becoming-errors behaviour is weird, and I'd like to understand what's going on here.

Upvotes: 2

Views: 892

Answers (1)

Steve Weston

Reputation: 19677

It appears to me that there is something wrong with the warning function when it is called with a simpleError object. At the top level it seems to work fine:

> warning(simpleError(1))
Warning message:
1 

but oddly enough, the warning is treated as an error when called inside tryCatch:

> tryCatch({
+   warning(simpleError(1))
+ }, error=function(e) {
+   cat('caught an error\n')
+   print(class(e))
+   print(e)
+ })
caught an error
[1] "simpleError" "error"       "condition"  
<simpleError: 1>
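
One plausible explanation, judging from how warning() treats condition objects: when its only argument is a condition, it signals that object unchanged, so handlers are matched on the object's own class. A simpleError would therefore reach an error handler if one is established, and only fall through to ordinary warning output when none is. A small base R sketch of that behaviour (the helper which_handler and the variables e and w are made up for illustration):

```r
# Base R only; `e`, `w`, and `which_handler` are illustrative names.
e <- simpleError("boom")

# Hypothetical helper: report which tryCatch handler catches the
# condition that warning() signals.
which_handler <- function(cond) {
    tryCatch({
        warning(cond)
        "no handler"
    },
    error   = function(c) "error handler",
    warning = function(c) "warning handler")
}

which_handler(e)   # caught by the error handler

# Rebuild the condition as a genuine warning, keeping message and call.
w <- simpleWarning(conditionMessage(e), call = conditionCall(e))
which_handler(w)   # caught by the warning handler
```

The second call differs only in the class of the object passed to warning(), which is what decides the handler.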

llply with .parallel=TRUE runs its tasks through foreach. Since the foreach package evaluates the body of the loop in tryCatch, it thinks that an error has occurred. For example:

> library(foreach)
> foreach(i=1:4) %do% warning(simpleError(1))
Error in warning(simpleError(1)) : task 1 failed - "1"

That means that passing the .errorhandling='pass' option to foreach via the .paropts option should prevent the error from aborting llply:

> x <- llply(1:4, .parallel=TRUE, .paropts=list(.errorhandling='pass'),
+       .fun=function(i) {
+       result<-tryCatch({
+           if(i %% 2==0) stop(i)
+           i}, error=function(e) {
+               warning(e)
+               NULL})
+       result})
> x
[[1]]
[1] 1

[[2]]
<simpleError in doTryCatch(return(expr), name, parentenv, handler): 2>

[[3]]
[1] 3

[[4]]
<simpleError in doTryCatch(return(expr), name, parentenv, handler): 4>

It looks like you can fix this problem by changing the class of the simpleError object to a simpleWarning before calling warning:

x <- llply(1:4, .parallel=TRUE,
      .fun=function(i) {
      result<-tryCatch({
          if(i %% 2==0) stop(i)
          i}, error=function(e) {
              class(e) <- class(simpleWarning(''))
              warning(e)
              NULL})
      result})
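
A variant of the same idea, as a sketch only: build a fresh warning with simpleWarning() from the error's message and call, rather than overwriting the class vector. The helper name warn_on_error is hypothetical. The example runs sequentially with lapply so that it does not depend on a parallel backend, and it collects the warning messages in msgs to show that they really are signalled as warnings:

```r
# Hypothetical helper: evaluate `expr`; on error, raise a warning that
# carries the error's message and call, then return NULL.
warn_on_error <- function(expr) {
    tryCatch(expr, error = function(e) {
        warning(simpleWarning(conditionMessage(e), call = conditionCall(e)))
        NULL
    })
}

# Collect the warnings instead of printing them, to inspect them afterwards.
msgs <- character(0)
res <- withCallingHandlers(
    lapply(1:4, function(i) {
        warn_on_error(if (i %% 2 == 0) stop(i) else i)
    }),
    warning = function(w) {
        msgs <<- c(msgs, conditionMessage(w))
        invokeRestart("muffleWarning")
    }
)

msgs   # messages taken from the two failed tasks
res    # values for tasks 1 and 3, NULL for tasks 2 and 4
```

Whether such warnings are displayed when the function runs inside parallel workers is a separate matter; the question reports that they seem to be squelched there.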

If you want to get the warnings when running sequentially or in parallel, you could convert the error objects into warning objects and return them with the other results. For example:

x <- llply(1:4, .parallel=TRUE, .fun=function(i) {
         result<-tryCatch({
             if(i %% 2==0) stop(i)
             i}, error=function(e) {
                class(e) <- class(simpleWarning(''))
                e})
         result})

for (i in seq_along(x)) {
    if (inherits(x[[i]], 'simpleWarning')) {
        warning(x[[i]])
        x[i] <- list(NULL)
    }
}

Upvotes: 2
