I'm getting an error while trying to train a model on a dataset with the caret package. The error is the following:

Error in train.default(x, y, weights = w, ...) : Stopping

I also get 11 warnings(), all of them the same, which I think is because I'm creating an object for the tuneGrid with the following code: grid <- expand.grid(cp = seq(0, 0.05, 0.005)). This code creates a data.frame with 11 rows, which correspond to the 11 warnings I'm getting. Here is the warning:

In eval(expr, envir, enclos) :
  model fit failed for Fold01: cp=0 Error in [.data.frame(m, labs) : undefined columns selected

It looks like cp doesn't have anything, yet I can go to my environment and see the grid object with all 11 rows. I have searched Stack Overflow and found similar questions, but since these functions have so many ways to tweak them, I haven't found a question that fixes my problem.
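One quick way to rule out the grid as the culprit is to inspect it directly. This is a minimal sketch using only base R; it shows that the grid has the expected shape and says nothing about why the model fit fails.

```r
# Rebuild the tuning grid exactly as in the question.
grid <- expand.grid(cp = seq(0, 0.05, 0.005))

# The grid should have one column named 'cp' and no missing values.
print(nrow(grid))     # number of candidate cp values
print(names(grid))    # name of the tuning parameter column
print(anyNA(grid$cp)) # TRUE would mean the grid really is broken
```

If these checks look fine, the "undefined columns selected" message is more likely to come from the columns of the training data than from cp.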
Here is my code...
require(rpart)
require(rattle)
require(rpart.plot)
require(caret)
setwd('~/Documents/Lipscomb/predictive_analytics/class4/')
data <- read.csv(file = 'data.csv',
                 header = FALSE)
data <- subset(data, select = -V1)
colnames(data) <- c('diagnostic', 'm.radius', 'm.texture', 'm. perimeter', 'm.area', 'm.smoothness', 'm.compactness', 'm.concavity', 'm.concave.points', 'm.symmetry', 'm.fractal.dimension',
                    'se.radius', 'se.texture', 'se. perimeter', 'se.area', 'se.smoothness', 'se.copactness', 'se.concavity', 'se.concave.points', 'se.symmetry', 'se.fractal.dimension',
                    'w.radius', 'w.texture', 'w. perimeter', 'w.area', 'w.smoothness', 'w.copactness', 'w.concavity', 'w.concave.points', 'w.symmetry', 'w.fractal.dimension')
str(data)
set.seed(7)
sample.train <- sample(1:nrow(data), nrow(data) * .8)
sample.test <- setdiff(1:nrow(data), sample.train)
data.train <- data[sample.train, ]
data.test <- subset(data[sample.test, ], select = -diagnostic)
rpart.tree <- rpart(diagnostic ~ ., data = data.train)
out <- predict(rpart.tree, data.test, type = 'class')
table(out, data[sample.test, ]$diagnostic)
fancyRpartPlot(rpart.tree)
temp <- rpart.control(xval = 10, minbucket = 2, minsplit = 4, cp = 0)
dfit <- rpart(diagnostic ~ ., data = data.train, control = temp)
fancyRpartPlot(dfit)
fit.control <- trainControl(method = 'cv', number = 10)
grid <- expand.grid(cp = seq(0, 0.05, 0.005))
trained.tree <- train(diagnostic ~ ., method = 'rpart', data = data.train,
                      metric = 'Accuracy', maximize = TRUE,
                      trControl = fit.control, tuneGrid = grid)
Upvotes: 0
I have found a solution to this problem: I changed the names I was assigning with colnames. For some reason, the original column names caused an error when using the train function. This code fixed the problem:
colnames(data) <- c('diagnostic', 'radius', 'texture', 'perimeter', 'area', 'smoothness', 'compactness', 'concavity', 'concavePoints', 'symmetry', 'fractalDimension',
                    'SeRadius', 'SeTexture', 'SePerimeter', 'SeArea', 'SeSmoothness', 'SeCopactness', 'SeConcavity', 'SeConcavePoints', 'SeSymmetry', 'SeFractalDimension',
                    'Wradius', 'Wtexture', 'Wperimeter', 'Warea', 'Wsmoothness', 'Wcopactness', 'Wconcavity', 'WconcavePoints', 'Wsymmetry', 'WfractalDimension')
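A likely reason, though I have not confirmed it against caret's internals, is that three of the original names contained a space ('m. perimeter', 'se. perimeter', 'w. perimeter'), which makes them syntactically invalid as R names. The sketch below uses base R's make.names() to spot such names. The shortened old.names vector and the gsub() clean-up are illustrative assumptions, not the exact fix I applied above.

```r
# A shortened sample of the original names; three contain a space.
old.names <- c('diagnostic', 'm.radius', 'm. perimeter',
               'se. perimeter', 'w. perimeter')

# make.names() returns syntactically valid versions of its input,
# so any name it changes was not a valid R name to begin with.
bad <- old.names[old.names != make.names(old.names)]
print(bad)

# One possible clean-up (an assumption): drop the spaces.
fixed.names <- gsub(' ', '', old.names, fixed = TRUE)
print(all(fixed.names == make.names(fixed.names)))
```

Running the same check on colnames(data) before calling train should show whether any column name needs renaming.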
Upvotes: 1