loki

Reputation: 10360

How to reduce size of randomForest object

I am trying to use a randomForest object to predict over a huge RasterLayer (34 million cells, 120+ layers). For this, I use the clusterR function from the raster package. However, when the prediction starts, the previously fitted randomForest object is loaded into every parallel worker, so all the workers combined need a lot of memory.
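For context, the parallel prediction is set up roughly like this (a sketch only: covariates is a placeholder for the actual 120+ layer stack, which is not shown here, and rfo is the model fitted below):

library(raster)

# covariates: placeholder for the real RasterStack with the 120+ layers;
# its layer names must match the column names used in training
beginCluster(4)                            # start 4 parallel workers
pred <- clusterR(covariates, raster::predict,
                 args = list(model = rfo)) # rfo is copied to each worker
endCluster()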

Is it possible to reduce the size of a randomForest object without losing the model? Does anyone have experience with this?

I create the model like this:

library(randomForest)

set.seed(42)
# dummy training data: a 3-class response plus 100 uniform covariates
df <- data.frame(class = sample(x = 1:3, size = 10000, replace = TRUE))
str(df)

# append the 100 random covariate columns
for (i in 1:100){
  df <- cbind(df, runif(10000))
}

colnames(df) <- c("class", 1:100)

df$class <- as.factor(df$class)

rfo <- randomForest(x = df[,2:ncol(df)], 
                    y = df$class, 
                    ntree = 500, 
                    do.trace = 10)

object.size(rfo) 
# 57110816 bytes
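To see where those 57 MB actually sit, the size of each list component of the fitted object can be inspected, and components that predict() presumably does not need could be nulled out. This is only a sketch: which components are safe to drop is my assumption, hence the sanity check at the end.

# size of each component of the fitted object
sort(sapply(rfo, object.size), decreasing = TRUE)

# drop components that (I assume) are only used for diagnostics
rfo_small <- rfo
rfo_small$votes     <- NULL  # OOB vote matrix
rfo_small$predicted <- NULL  # OOB predictions
rfo_small$oob.times <- NULL
rfo_small$err.rate  <- NULL
rfo_small$y         <- NULL  # training response

object.size(rfo_small)

# sanity check: predictions must be unchanged
stopifnot(identical(predict(rfo, df[, -1]),
                    predict(rfo_small, df[, -1])))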

Upvotes: 1

Views: 706

Answers (0)
