Pablo

Reputation: 733

caret::train pass extra parameters rpart

I'm building a decision tree with rpart via the caret::train function. I want to set rpart's minsplit parameter to 1 so that the tree is fully grown, and then prune it afterwards using cp. What I get from here is that extra parameters should be passed through the ... of the train function, but this doesn't work. A minimal reproducible example:

library(caret)
library(rpart)
mod1 <- train(Species ~ ., iris, method = "rpart", tuneGrid = expand.grid(cp = 0), minsplit = 1)
mod2 <- rpart(Species ~ ., iris, cp = 0, minsplit = 1)

The result is that mod1$finalModel and mod2 are quite different. I would like mod1$finalModel to be like mod2 (i.e., totally overfitted). I cannot pass the parameter through tuneGrid either, since it only accepts a cp column.

So my question is: is there any way in caret to pass the argument minsplit = 1 to the train function and then cross-validate over the cp parameter?

Upvotes: 3

Views: 4200

Answers (2)

Gregor Kvas

Reputation: 161

I suppose that control = rpart.control() is needed to pass the arguments minsplit and minbucket through the caret train function, since that is the correct way to set them in the rpart function itself, which receives the arguments via the ... of train. Best, G

Upvotes: 0

Pablo

Reputation: 733

Ok, thanks to this post I figured out how to do it:

mod1 <- train(Species ~ ., iris, method = "rpart", 
             control = rpart.control(minsplit = 1, minbucket = 1))
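
To also cross-validate over cp, as the question asks, the same control argument can presumably be combined with tuneGrid and trControl. The following is only a sketch: the cp grid values, the 5-fold setup, the seed, and the name mod_cv are arbitrary choices for illustration, not taken from the thread.

```r
library(caret)
library(rpart)

set.seed(1)  # arbitrary seed, only to make the resampling reproducible

# Grow trees with minsplit = 1 / minbucket = 1 and let caret
# choose cp by 5-fold cross-validation over a small, arbitrary grid.
mod_cv <- train(Species ~ ., data = iris, method = "rpart",
                tuneGrid  = expand.grid(cp = c(0, 0.01, 0.05, 0.1)),
                trControl = trainControl(method = "cv", number = 5),
                control   = rpart.control(minsplit = 1, minbucket = 1))

mod_cv$bestTune                     # cp value selected by cross-validation
mod_cv$finalModel$control$minsplit  # control settings stored in the final rpart fit
```

Checking mod_cv$finalModel$control is a quick way to confirm that the minsplit and minbucket settings actually reached rpart.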

I'm still not quite sure why the argument has to be passed via control = rpart.control(). Passing just the arguments minsplit = 1, minbucket = 1 directly to the train function simply doesn't work.
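
One possible explanation, not confirmed anywhere in this thread: as far as I understand, rpart first builds its control settings from the arguments in ... and then overwrites them with the entries of an explicit control object whenever one is supplied. If caret's rpart wrapper always supplies its own control object (an assumption worth checking against the caret source), a bare minsplit = 1 would be silently replaced by the default. The effect can be sketched with rpart alone; fit_dots and fit_ctrl are names made up for this example.

```r
library(rpart)

# minsplit passed through `...` together with an explicit control object.
# Assumption: the control object takes precedence over `...`.
fit_dots <- rpart(Species ~ ., data = iris, minsplit = 1,
                  control = rpart.control(cp = 0, xval = 0))

# minsplit set inside the control object itself.
fit_ctrl <- rpart(Species ~ ., data = iris,
                  control = rpart.control(cp = 0, xval = 0,
                                          minsplit = 1, minbucket = 1))

# Compare the settings each fit actually used.
fit_dots$control$minsplit
fit_ctrl$control$minsplit
```

If the two printed values differ, the minsplit given through ... was overridden by the control object, which would match the behaviour seen with train.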

Upvotes: 6
