Reputation: 73
I am new to using R for data mining and machine learning. While studying the Naive Bayes classifier, I came across this error:
"Error in which((sapply(newdata[ind_factor], nlevels) != sapply(tables[ind_factor], : (list) object cannot be coerced to type 'integer'"
This is my code:
data <- read.csv(file.choose(),header = T)
str(data)
set.seed(1234)
splitData <- sample(2,nrow(data),replace = T,prob = c(0.8,0.2))
train<-data[splitData == 1,]
test <- data[splitData == 2,]
mdl <- naive_bayes(admit ~ .,data = train)
predicted <- predict(mdl, train, type = 'prob')
When I run the final line, it throws the error message above. Can anyone help me, please? Thanks a lot.
Upvotes: 3
Views: 14793
Reputation: 398
It looks like one of your independent variables is a string or factor variable; in my testing, all of them needed to be numeric to avoid this error. See my toy dataset below: I get the same error when I include all the variables, but when I take var4 out (the variable whose values are strings), it works.
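One way to check which predictors are not numeric before fitting is to look at each column's class. This is only a minimal base R sketch: the data frame `df` and its columns (`gre`, `rank`) are made up for illustration and are not the asker's actual data.

```r
# Hypothetical data frame standing in for the data read from the CSV
df <- data.frame(
  admit = c(TRUE, FALSE, TRUE, FALSE),
  gre   = c(380, 660, 800, 640),
  rank  = c("s1", "s2", "s1", "s2"),
  stringsAsFactors = FALSE
)

# Class of every column, as a named character vector
col_classes <- sapply(df, class)
print(col_classes)

# Predictors (everything except the response, admit) that are not numeric
predictors <- setdiff(names(df), "admit")
non_numeric_predictors <- predictors[!sapply(df[predictors], is.numeric)]
print(non_numeric_predictors)
```

Any column listed in `non_numeric_predictors` is a candidate for the cause of the error and could be dropped from the formula or converted.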
If you want to keep that variable, you could convert the string variable to a factor and then convert the factor to a numeric variable (which gives the factor's underlying integer level codes).
library(naivebayes)
#data <- read.csv(file.choose(),header = T)
data <- data.frame(admit = sample(c(FALSE, TRUE), 100, replace = TRUE, prob = c(0.5, 0.5)),
                   var1 = sample(1:4, 100, replace = TRUE),
                   var2 = sample(1:3, 100, replace = TRUE),
                   var3 = sample(1:3, 100, replace = TRUE),
                   var4 = sample(c("s1", "s2"), 100, replace = TRUE))
str(data)
set.seed(1234)
splitData <- sample(2,nrow(data),replace = T,prob = c(0.8,0.2))
train<-data[splitData == 1,]
test <- data[splitData == 2,]
# Doesn't work
mdl <- naive_bayes(admit ~ .,data = train)
predicted <- predict(mdl, train, type = 'prob')
# Works
mdl <- naive_bayes(admit ~ var1 + var2 + var3,data = train)
predicted <- predict(mdl, train, type = 'prob')
# Convert string to factor then numeric
train$var4 <- as.numeric(as.factor(train$var4))
mdl <- naive_bayes(admit ~ .,data = train)
predicted <- predict(mdl, train, type = 'prob')
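To make the string-to-factor-to-numeric step more concrete, here is a small base R sketch of what it produces, and one way to apply it to a test set with the same mapping as the training set. The vectors `train_var4` and `test_var4` are hypothetical stand-ins, and reusing the training levels is my assumption about how to keep the two sets consistent; it is not part of the code above.

```r
# Hypothetical string columns from a training and a test split
train_var4 <- c("s2", "s1", "s2", "s1")
test_var4  <- c("s2", "s2")

# as.factor() sorts the distinct values into levels ("s1", "s2");
# as.numeric() then returns the integer level code of each element
codes_train <- as.numeric(as.factor(train_var4))

# Converting the test column on its own: only "s2" is present,
# so it becomes level 1 and no longer matches the training codes
codes_test_separate <- as.numeric(as.factor(test_var4))

# Reusing the training levels keeps the mapping identical in both sets
train_levels <- levels(as.factor(train_var4))
codes_test <- as.numeric(factor(test_var4, levels = train_levels))

print(codes_train)
print(codes_test_separate)
print(codes_test)
```

Note that the resulting numbers are arbitrary level codes, so the model will treat var4 as an ordinary numeric variable rather than as categories.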
Upvotes: 4