Reputation: 2247
I am new to R and am learning ML using caret. I was working on the UCI bank marketing response data, but I use the iris data here for reproducibility.
The issue is that I get an error when running vif from the car package on classification models.
library(tidyverse)
library(caret)
library(car)
iris
# to make it binary classification
iris_train <- iris %>% filter(Species %in% c("setosa","versicolor"))
iris_train$Species <- factor(iris_train$Species)
Creating Model
model_iris3 <- train(Species ~ .,
data = iris_train,
method = "gbm",
verbose = FALSE
# tuneLength = 5,
# metric = "Spec",
# trControl = fitCtrl
)
Error in vif
# vif
car::vif(model_iris3)
Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "c('train', 'train.formula')"
I learned about using finalModel for vif from this SO post: Variance inflation VIF for glm caret model in R
But I still get an error:
car::vif(model_iris3$finalModel)
Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "gbm"
I get the same error with adaboost, earth, etc.
Appreciate any help or suggestions to solve this issue.
(UPDATE)
Finally this worked (see the complete solution in the answers if you still get an error):
vif doesn't work on classification models such as gbm, so convert the dependent variable to numeric, run a linear regression on it, and then run vif.
model_iris4 <- train(as.numeric(Species) ~ .,
data = iris_train,
method = "lm",
verbose = FALSE
# tuneLength = 5,
# metric = "Spec",
# trControl = fitCtrl
)
car::vif(model_iris4$finalModel)
######## output ##########
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
    4.803414     2.594389    36.246326    25.421395
Upvotes: 1
Views: 3334
Reputation: 2247
Finally this worked:
vif doesn't work on classification models such as gbm, so convert the dependent variable to numeric, run a linear regression on it, and then run vif.
model_iris4 <- train(as.numeric(Species) ~ .,
data = iris_train,
method = "lm",
verbose = FALSE
# tuneLength = 5,
# metric = "Spec",
# trControl = fitCtrl
)
car::vif(model_iris4$finalModel)
######## output ##########
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
    4.803414     2.594389    36.246326    25.421395
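One reason the numeric-response trick is harmless: in a linear model, the VIF of a predictor depends only on the other predictors, not on the response. As a rough cross-check (a base R sketch, not car's own code), each VIF can be recomputed as 1 / (1 - R^2), where R^2 comes from regressing that predictor on the remaining ones. The helper name manual_vif is made up for this illustration.

```r
# Same two-class subset of iris as in the question (base R only)
iris_train <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))
predictors <- iris_train[, setdiff(names(iris_train), "Species")]

# manual_vif is a hypothetical helper: regress each predictor on the others
# and turn the resulting R-squared into 1 / (1 - R^2)
manual_vif <- function(X) {
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

vif_manual <- manual_vif(predictors)
print(vif_manual)
```

On this subset the values should agree with the car::vif output above, since neither calculation uses Species.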
If the model contains dummy variables, there is a good chance that vif will still throw an error.
For example, after following the steps above I got a new error on my original UCI banking dataset: Error in vif.default(model_vif_check$finalModel) : there are aliased coefficients in the model
To solve this error, try the steps below.
Run alias() on a model where the predicted variable is numeric:
alias_res <- alias(
lm( as.numeric(y) ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+contact.cellular+previous+age+cons.price.idx+month.jun+job.retired, data = train )
)
alias_res
ld.vars <- attributes(alias_res$Complete)$dimnames[[1]]
ld.vars
This returns the aliased predictor that was causing the error, so remove that predictor from the model and fit the model again (in my case it was "contact.cellular"):
model_vif_check_aliased <- train(as.numeric(pull(y)) ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+previous+age+cons.price.idx+month.jun+job.retired,
data = train,
method = "lm"
)
model_vif_check_aliased
Now run vif
vif_values <- car::vif(model_vif_check_aliased$finalModel)
vif_values
         duration       nr.employed         euribor3m             pdays
         1.016706         75.587546         80.930134         10.216410
     emp.var.rate  poutcome.success         month.mar     cons.conf.idx
        64.542469          9.190354          1.077018          3.972748
contact.telephone          previous               age    cons.price.idx
         2.091533          1.850089          1.185461         28.614339
        month.jun       job.retired
         3.936681          1.198350
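To see the aliasing problem in isolation, here is a small sketch on made-up data (the toy data frame and its values are invented; only the dummy column names echo the ones above). Two dummy columns that always sum to 1 are perfectly collinear with the intercept, so lm cannot estimate both, and alias() reports the redundant one.

```r
# Toy data: values are invented purely for illustration
set.seed(42)
n <- 40
toy <- data.frame(
  age = rnorm(n, mean = 40, sd = 10),
  contact.telephone = rep(c(0, 1), length.out = n)
)
toy$contact.cellular <- 1 - toy$contact.telephone  # exact complement of the other dummy
toy$y <- rnorm(n)

fit <- lm(y ~ age + contact.telephone + contact.cellular, data = toy)

# Row names of the "Complete" matrix are the aliased (linearly dependent) terms
aliased <- rownames(alias(fit)$Complete)
print(aliased)

# Equivalent check: lm reports aliased coefficients as NA
print(names(coef(fit))[is.na(coef(fit))])

# Drop the aliased term(s) and refit; no NA coefficients should remain
drop_formula <- as.formula(paste(". ~ . -", paste(aliased, collapse = " - ")))
fit_clean <- update(fit, drop_formula)
print(coef(fit_clean))
```

Dropping one dummy per category in this way is usually enough to make the "aliased coefficients" error go away, though very high VIF values can still remain.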
Upvotes: 2
Reputation: 851
car::vif is a function that needs to be adapted for each type of model. It works in the linked question because car::vif has been implemented to cope with glm models. car::vif does not support your chosen model type: gbm.
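As a quick illustration of this point (a sketch that uses the built-in mtcars data as a stand-in, because the two iris classes are perfectly separable and make a poor logistic-regression example): the error in the question is raised while dispatching vcov, and a glm fit does have a vcov method, so car::vif has something to work with. The car call is guarded in case the package is not installed.

```r
# Logistic regression on built-in data (mtcars is a stand-in, not the asker's data)
fit_glm <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# vcov dispatches fine for glm objects, unlike for the gbm fit in the question
print(vcov(fit_glm))

# With car installed, vif can therefore be computed for this classification model
if (requireNamespace("car", quietly = TRUE)) {
  print(car::vif(fit_glm))
}
```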
Upvotes: 1