WitchKingofAngmar

Reputation: 182

makeCluster with parallelSVM in R takes up all Memory and swap

I'm trying to train an SVM model on a large dataset (~110k training points). Below is a sample of the code, in which I use the parallelSVM package to parallelize the training step on a subset of the training data on my 4-core Linux machine.

numcore = 4
train.time = c()
for(i in 1:5)
{
    cl = makeCluster(4)
    registerDoParallel(cores=numCore)
    getDoParWorkers()
    dummy = train_train[1:10000*i,]
    begin = Sys.time()
    model.svm = parallelSVM(as.factor(target) ~ .,data =dummy,
        numberCores=detectCores(),probability = T)
    end = Sys.time() - begin
    train.time = c(train.time,end)
    stopCluster(cl)
    registerDoSEQ()
}

The idea of this snippet is to estimate how long it will take to train the model on the entire dataset by gradually increasing the size of the dummy training set. After running the code above for 10,000 and 20,000 training samples, this is the memory and swap usage history from the System Monitor. After 4 runs of the for loop, both memory and swap usage are at about 95%, and I get the following error:

Error in summary.connection(connection) : invalid connection

Any ideas on how to manage this problem? Is there a way to deallocate the memory used by a cluster after calling stopCluster()?

Please take into consideration the fact that I am an absolute beginner in this field. A short explanation of the proposed solutions will be greatly appreciated. Thank you.

Upvotes: 0

Views: 898

Answers (1)

Hong Ooi

Reputation: 57686

Your line

registerDoParallel(cores=numCore)

creates a new cluster with a number of nodes equal to numCore (whose value you haven't shown; the snippet only defines numcore). This cluster is never destroyed, so with each iteration of the loop you're starting more new R processes. Since you're already creating a cluster with cl = makeCluster(4), you should use

registerDoParallel(cl)

instead.

(And move the makeCluster, registerDoParallel, stopCluster and registerDoSEQ calls outside the loop.)
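
Put together, the suggestion above might look like the following minimal sketch. It assumes the doParallel package is installed; time_training() and its arguments are hypothetical names introduced here for illustration, not part of parallelSVM or doParallel. The cluster is created and registered once, reused for every subset size, and shut down on exit. Note also that 1:10000*i in the question parses as (1:10000)*i, so the sketch takes the first n rows with seq_len() instead.

```r
library(parallel)    # makeCluster(), stopCluster()
library(doParallel)  # registerDoParallel(); also attaches foreach

# Hypothetical helper: times train_fn on growing subsets of `data`,
# using a single cluster for the whole run.
time_training <- function(train_fn, data, sizes, n_workers = 4) {
  cl <- makeCluster(n_workers)   # start the worker processes once
  on.exit({
    stopCluster(cl)              # shut the workers down, even if train_fn fails
    registerDoSEQ()              # return foreach to sequential execution
  }, add = TRUE)
  registerDoParallel(cl)         # register the existing cluster, don't create another

  elapsed <- numeric(length(sizes))
  for (k in seq_along(sizes)) {
    subset_k <- data[seq_len(sizes[k]), , drop = FALSE]  # first sizes[k] rows
    begin <- Sys.time()
    train_fn(subset_k)
    elapsed[k] <- as.numeric(difftime(Sys.time(), begin, units = "secs"))
  }
  elapsed
}

# Possible usage with the question's data (not run here; needs parallelSVM
# and the train_train data frame):
# train.time <- time_training(
#   function(d) parallelSVM(as.factor(target) ~ ., data = d,
#                           numberCores = 4, probability = TRUE),
#   train_train, sizes = 10000 * 1:5)
```

Using on.exit() means the workers are stopped even when a training call throws an error, so a failed run does not leave stray R processes holding memory.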

Upvotes: 1
