Reputation: 155
As per the glmulti package documentation, chunk and chunks are the arguments for using multiple CPUs when doing an exhaustive screening.
But even when I set both chunk and chunks to 4 with method='h' and family='binomial', R only uses a single core.
The call I used:
glmulti(y ~ ., level = 1, data = ctrain, fitfunction = 'glm', chunk = 4, chunks = 4, method = 'h', family = 'binomial')
A demo data set similar to mine: https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank-additional.zip
PS: any other package that solves the problem is also acceptable.
Upvotes: 2
Views: 402
Reputation: 2738
## try chunk of chunks, per the vignette
## ('mod' stands for your model specification, i.e. the formula and data from the question):
chunk1of2 <- glmulti(mod,
                     level = 2,
                     method = "h",
                     marginality = TRUE,
                     name = "exhausting_glm",
                     chunks = 2,   ## split the candidate set into 2 chunks...
                     chunk = 1)    ## ...and fit chunk 1 of 2 in this call
## file = "|object" saves the glmulti object itself, named after its 'name' slot
write(chunk1of2, file = "|object")

chunk2of2 <- glmulti(mod,
                     level = 2,
                     method = "h",
                     marginality = TRUE,
                     name = "exhausting_glm",
                     chunks = 2,
                     chunk = 2)    ## fit chunk 2 of 2
write(chunk2of2, file = "|object")

## gather the saved chunk files and combine them into one glmulti object
fullobj <- consensus(as.list(list.files(pattern = "exhausting_glm")),
                     confsetsize = NA)
summary(fullobj)$bestmodel
These calls will save the files "exhausting_glm1.1" and "exhausting_glm1.2" in your current working directory, and consensus will grab them. Please note the as.list() in the consensus call -- it wasn't included in the vignette, but I needed it to prevent an error.
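Before splitting the job into chunks, you can ask glmulti how large the candidate set is. A minimal sketch, reusing the placeholder mod and the settings above; per the documentation, method = "d" only prints diagnostics (including the number of candidate models) without fitting anything:
## diagnostic run: reports the size of the candidate set
glmulti(mod, level = 2, marginality = TRUE, method = "d")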
Let's say you ran glmulti for your circumstances with method='d' to get the diagnostics, and the call reported that you had 327,680 candidate models. If you did the following, you would fit two formulae in each chunk. There are a variety of ways to distribute those chunks depending on your computing system/resources (one parallel option is sketched after the code below):
## try chunk of chunks, per the vignette:
chunk1of2 <- glmulti(mod,
                     level = 2,
                     method = "h",
                     marginality = TRUE,
                     name = "exhausting_glm",
                     chunks = 327680/2,   ## 163,840 chunks of two models each
                     chunk = 1)
write(chunk1of2, file = "|object")

chunk2of2 <- glmulti(mod,
                     level = 2,
                     method = "h",
                     marginality = TRUE,
                     name = "exhausting_glm",
                     chunks = 327680/2,
                     chunk = 2)
write(chunk2of2, file = "|object")

fullobj <- consensus(as.list(list.files(pattern = "exhausting_glm")),
                     confsetsize = NA)
## best of the 4 models fit out of the 327,680 possible
summary(fullobj)$bestmodel
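For example, to spread the chunks over the cores of one machine, you could launch the chunk calls yourself with the parallel package. This is only a minimal sketch, not a mechanism built into glmulti: it assumes a Unix-alike system (mclapply relies on forking; use parLapply on Windows), the formula and data from the question, a chunk count of 4 chosen purely for illustration, and that consensus() accepts in-memory glmulti objects as well as saved file names.
library(glmulti)
library(parallel)

n_chunks <- 4
## one forked worker per chunk; each glmulti() call fits only its share of the models
parts <- mclapply(seq_len(n_chunks), function(i) {
  glmulti(y ~ ., data = ctrain, level = 1,
          fitfunction = "glm", family = "binomial",
          method = "h", chunks = n_chunks, chunk = i,
          plotty = FALSE, report = FALSE)   ## keep the workers quiet
}, mc.cores = n_chunks)

## combine the per-chunk results into one glmulti object
fullobj <- consensus(parts, confsetsize = NA)
summary(fullobj)$bestmodel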
One thing to note regarding scaling and breaking big candidate sets into many smaller chunks -- I found that if method='d' reported a large number of models, then breaking the run into chunks was still computationally intensive, because each chunk had to "reinvent the wheel" and enumerate all the candidate models from scratch again -- and this takes time.
Upvotes: 0
Reputation: 11738
If you read the vignette (which you can download there), you will see that chunk determines only one part of the computation.
I think you just need to make the calls in a loop with chunk in seq_len(chunks) and combine the results.
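A minimal sketch of that loop, assuming the formula and data from the question, an arbitrary chunk count of 4, and that consensus() accepts a list of glmulti objects as well as file names; to actually use several cores you would run each iteration in its own R process or parallelize the loop:
library(glmulti)

n_chunks <- 4
results <- vector("list", n_chunks)
for (i in seq_len(n_chunks)) {
  ## each call enumerates the full candidate set but fits only its share of the models
  results[[i]] <- glmulti(y ~ ., data = ctrain, level = 1,
                          fitfunction = "glm", family = "binomial",
                          method = "h", chunks = n_chunks, chunk = i)
}
fullobj <- consensus(results, confsetsize = NA)
summary(fullobj)$bestmodel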
You should email the author or open an issue for further information.
Upvotes: 1