Jonas

Reputation: 1639

Parallelization of bnlearn (with the parallel package)

I am using the R package bnlearn to estimate Bayesian network structures. It has built-in parallelization via the parallel package. However, I cannot get it to work.

Using the example from the bnlearn man page "parallel integration":

library(parallel)
library(bnlearn)

cl = makeCluster(2)

# check it works.
clusterEvalQ(cl, runif(10))    # -> this works

data(learning.test)
res = gs(learning.test, cluster = cl)

Here I get the error "Error in check.cluster(cluster) : cluster is not a valid cluster object."

Does anybody know how to get this working?

Upvotes: 4

Views: 1866

Answers (1)

Roland

Reputation: 132959

This is a bug. Please report it to the package maintainer.

Here is the code of check.cluster:

function (cluster) 
{
    if (is.null(cluster)) 
        return(TRUE)
    if (any(class(cluster) %!in% supported.clusters)) 
        stop("cluster is not a valid cluster object.")
    if (!requireNamespace("parallel")) 
        stop("this function requires the parallel package.")
    if (!isClusterRunning(cluster)) 
        stop("the cluster is stopped.")
}

Now, if you look at the class of cl:

class(cl)
#[1] "SOCKcluster" "cluster" 

Let's reproduce the check:

bnlearn:::supported.clusters
#[1] "MPIcluster"  "PVMcluster"  "SOCKcluster"

`%!in%` <- function (x, table) {
  match(x, table, nomatch = 0L) == 0L
}
any(class(cl) %!in% bnlearn:::supported.clusters)
#[1] TRUE

The class "cluster" is not in supported.clusters, so any() returns TRUE and the check stops with an error. I believe the function should only check whether the cluster has at least one supported class, not whether it has any unsupported class.
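
To make the difference between the two checks concrete, here is a minimal sketch in base R. It is not the maintainer's actual fix; the variable names are made up for illustration, and the class vectors are simply copied from the output above so that no running cluster is needed.

```r
# Class vector of a socket cluster, copied from class(cl) above.
cl_class <- c("SOCKcluster", "cluster")
# Values copied from bnlearn:::supported.clusters above.
supported <- c("MPIcluster", "PVMcluster", "SOCKcluster")

# Logic as in check.cluster: reject if ANY class is unsupported.
rejects_any_unsupported <- any(!(cl_class %in% supported))

# Suggested logic (an assumption about the intent): reject only if
# NO class is supported.
rejects_none_supported <- !any(cl_class %in% supported)

# inherits() expresses the same "at least one class matches" test
# on an object; dummy is a stand-in with the same class vector.
dummy <- structure(list(), class = cl_class)
accepted_by_inherits <- inherits(dummy, supported)

print(rejects_any_unsupported)  # TRUE: the valid cluster is rejected
print(rejects_none_supported)   # FALSE: the valid cluster is accepted
print(accepted_by_inherits)     # TRUE
```

With the second form, the extra "cluster" class that every parallel cluster object carries no longer triggers the error.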

As a workaround, you could change supported.clusters so that it includes "cluster":

assignInNamespace("supported.clusters", 
                  c("cluster", "MPIcluster",  
                    "PVMcluster", "SOCKcluster"), 
                  "bnlearn")

Upvotes: 6
