Reputation: 1
I am trying to use ROSE to help with an imbalanced dataset. I am about 90% there, but I am having trouble with my ovun.sample code. When I run it, it does not create an "over", "under", or "both" dataset; the objects show up in R as NULL (empty) rather than as data. I would appreciate any troubleshooting help on how this can be fixed!
```
set.seed(123)
ind <- sample(2, nrow(credit), replace = TRUE, prob = c(0.7, 0.3))
train <- credit[ind==1,]
test <- credit[ind==2,]
# Data for Developing the Predictive Model
table(train$DEFAULT)
prop.table(table(train$DEFAULT))
summary(train)
# Sample balancing (Over-, Under-, Both)
library(ROSE)
over <- ovun.sample(DEFAULT~., data = train, method = "over", N = 996)$credit
table(over$DEFAULT)
under <- ovun.sample(DEFAULT~., data = train, method = "under", N = 414)$credit
table(under$DEFAULT)
both <- ovun.sample(DEFAULT~., data = train, method = "both",
                    p = 0.5, seed = 213, N = 705)$credit
table(both$DEFAULT)
# Predictive Model (Random Forest)
library(randomForest)
rftrain <- randomForest(DEFAULT~., data = train,
                        ntree = 500, mtry = 10)
rfover <- randomForest(DEFAULT~., data = over,
                       ntree = 500, mtry = 10)
rfunder <- randomForest(DEFAULT~., data = under,
                        ntree = 500, mtry = 10)
rfboth <- randomForest(DEFAULT~., data = both,
                       ntree = 500, mtry = 10)
# Predictive Model Evaluation with test Data
library(caret)
confusionMatrix(predict(rftrain, test), test$DEFAULT, positive = '1')
confusionMatrix(predict(rfover, test), test$DEFAULT, positive = '1')
confusionMatrix(predict(rfunder, test), test$DEFAULT, positive = '1')
confusionMatrix(predict(rfboth, test), test$DEFAULT, positive = '1')```
Upvotes: 0
Views: 1514
Reputation: 31
I had the same issue. Try changing the code as follows:
over <- ovun.sample(DEFAULT~., data = train, method = "over", N = 996)$data
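Change $credit to $data in the "under" and "both" calls as well. ovun.sample returns a list whose resampled data frame is always stored in the element named data, whatever your input data frame is called, so $credit finds nothing and returns NULL.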
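Here is a minimal sketch of how to check this yourself. It assumes the ROSE package is installed and uses its bundled hacide example data as a stand-in for your credit data, which I don't have; the names res and over_ok, and N = 1960, are just for illustration.

```r
library(ROSE)

# hacide.train ships with ROSE; its class column is `cls` (stand-in for DEFAULT)
data(hacide)

# Keep the whole return value first so it can be inspected
res <- ovun.sample(cls ~ ., data = hacide.train, method = "over", N = 1960)

# Lists the elements of the returned object; the resampled data frame is `data`
names(res)

# There is no element named after the input data frame, so this is NULL
is.null(res$credit)

# The balanced data frame lives in $data
over_ok <- res$data
table(over_ok$cls)
```

Running names() on the result of your own ovun.sample call should show the same element names, which is a quick way to confirm what to put after the $.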
Upvotes: 1