Reputation: 21
In fact, there is a similar question and answer, but it does not work me. see below. The trick lies in rewrite fit of lmFunc
.
"Error in { : task 1 failed - "Results do not have equal lengths", many warning:glm.fit: fitted probabilities numerically 0 or 1 occurred"
where is the fault?
lmFuncs$fit=function (x, y, first, last, ...)
{
tmp <- as.data.frame(x)
tmp$y <- y
glm(y ~ ., data = tmp, family=binomial(link='logit'))
}
ctrl <- rfeControl(functions = lmFuncs,method = 'cv',number=10)
fit.rfe=rfe(df.preds,df.depend, rfeControl=ctrl)
And in the rfeControl help, it is said the parameter 'functions' that can be used with caret’s train function (caretFuncs). What does it really mean? Any details and example? Thanks
Upvotes: 2
Views: 2660
Reputation: 135
I was having a similar issue with customising lmFunc.
For logistic regression make sure you use lrFuncs and set size equal to the number of predictor variables. This leads to no issues.
Example (for functionality purposes only)
library(caret)
#Reproducible data
set.seed(1)
x <- data.frame(runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10),runif(10))
x$dpen <- sample(c(0,1), replace=TRUE, size=10)
x$dpen <- factor(x$dpen)
#Spliting training set into two parts based on outcome: 80% and 20%
index <- createDataPartition(x$dpen, p=0.8, list=FALSE)
trainSet <- x[ index,]
testSet <- x[-index,]
control <- rfeControl(functions = lrFuncs,
method = "cv", #cross validation
verbose = FALSE, #prevents copious amounts of output from being produced.
)
##RFE
rfe(trainSet[,1:28] #predictor varia,
trainSet[,9],
sizes = c(1:28) #size of predictor variables,
rfeControl = control)
Upvotes: 1