kepung

Reputation: 2132

R - Getting Non-tree model detected! This function can only be used with tree models

I'm new to R. When I run xgb.importance, I get this error:

"Error in xgb.model.dt.tree(feature_names = feature_names, text = text) : 
  Non-tree model detected! This function can only be used with tree models".

Any help would be greatly appreciated.

require(xgboost)

require(Matrix)

require(data.table)

if (!require('vcd')) install.packages('vcd')

a = data.frame(id=c(1,2,3,4,5), smoke=c('Yes','No','Yes', 'Yes', 'Yes'), sugar=c('Yes','No','Yes', 'Yes','Yes'), sex=c('M','F','F', 'M','F'), diseased=c('Yes','No','Yes', 'Yes','Yes'), age=c(20,21,45, 45, 40))

d <- data.table(a, keep.rownames = F)

head(d[,AgeDiscret := as.factor(round(age/10,0))])

head(d[,AgeCat:= as.factor(ifelse(age > 30, "Old", "Young"))])

s <- sparse.model.matrix(age~.-1, data = d)

ov = d[,diseased] == 'Yes'

mdl <- xgboost(data = s, label = ov, max_depth = 4, eta = 1, nthread = 2, nrounds = 10,objective = "binary:logistic")


importance <- xgb.importance(feature_names = colnames(s), model = mdl) #<-- error message 

Upvotes: 4

Views: 2888

Answers (3)

KasperGL

Reputation: 617

I got this error when the model passed as the model argument had been trained on data with perfect collinearity between all predictor variables and a target that strictly alternated between 0 and 1 (i.e. 0,1,0,1,0,1,0,1,0).

The data was test data generated for unit testing part of a package. I solved the problem by generating the predictor variables with rnorm() and the target variable with sample().
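A minimal sketch of that kind of test data, in case it helps. The sample size, seed, column names and parameter values are my own assumptions for illustration, not the values from the package in question. The training step is guarded so the data generation still runs where xgboost is not installed.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
n <- 100      # hypothetical sample size

# Independent predictors drawn with rnorm(), so no column is an exact
# linear combination of the others.
x <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Random 0/1 target drawn with sample(), instead of a strict 0,1,0,1 pattern.
y <- sample(c(0, 1), size = n, replace = TRUE)

# Train and ask for importance only if xgboost is available.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = x, label = y)
  bst <- xgboost::xgb.train(
    params = list(objective = "binary:logistic", max_depth = 4, nthread = 2),
    data = dtrain,
    nrounds = 10
  )
  print(xgboost::xgb.importance(model = bst))
}
```

With data like this the trees should normally contain at least one split, which is what xgb.importance needs.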

Upvotes: 1

Ivan Lee

Reputation: 4311

You should add the tree_method parameter, for example:

mdl <- xgboost(data = s, tree_method = 'gpu_hist', label = ov, max_depth = 4, eta = 1, nthread = 2, nrounds = 10,objective = "binary:logistic")

Upvotes: 1

kepung

Reputation: 2132

It seems there were two problems here.

  1. The data set was not large enough.
  2. There was not much variation in my data. For example, the smoke column only had 'Yes' and 'No'. I updated it to use 'Yes', 'No' and 'Casual'.

After that, the following code ran successfully.

a = data.frame(id=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17, 18, 19,20), smoked=c('Yes','No','Casual', 'Casual', 'Casual','Yes', 'Yes', 'Yes','Yes', 'Yes', 'Yes','Yes', 'Yes', 'Yes','Yes','Yes','Yes', 'Yes', 'Casual','Casual'), highIntakeSugar=c('Yes','No','Yes', 'Yes','Yes', 'Yes', 'Yes','Yes','Yes', 'Yes','Yes','Yes', 'Yes','Yes','Yes', 'Yes','Yes', 'Yes', 'Yes','Yes'), sex=c('M','F','F', 'M','F','F', 'M','F','F', 'M','F','F', 'M','F','F','F','F', 'M','F','F'), disease=c('Yes','No','Unknown','Unknown','Yes','Unknown', 'Unknown','Yes','Yes', 'Yes','Yes','Yes', 'Yes', 'Yes','Yes', 'Yes','Yes', 'Yes', 'Yes','Yes'), age=c(20,21,45, 45, 40,45, 35, 40,45, 45, 40,45, 45,40,45,40,45,45,40,45))

d <- data.table(a, keep.rownames = F)

d[,id:=NULL]
s <- sparse.model.matrix(age~.-1, data = d)

ov = d[,disease] == 'Yes'
mdl <- xgboost(data = s, label = ov, max_depth = 4, eta = 1, nthread = 2, nrounds = 10,objective = "binary:logistic")
xgb.importance(feature_names = colnames(s), model = mdl)
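A likely reason the data size matters, sketched as arithmetic. This assumes the xgboost defaults of min_child_weight = 1 and a starting predicted probability of 0.5, neither of which is stated in the post. Under those assumptions each row contributes a hessian of 0.25 in the first round, each child of a split needs a hessian sum of at least min_child_weight, and so no split is possible with fewer than 8 rows. A model whose trees are all single leaves is what triggers the "Non-tree model detected" error. The function name below is hypothetical.

```r
# Assumed defaults (not from the original post).
min_child_weight <- 1
p0 <- 0.5

# Hessian of the logistic loss for one row at probability p0.
hess_per_row <- p0 * (1 - p0)

# Rows needed in each child so its hessian sum reaches min_child_weight.
rows_per_child <- ceiling(min_child_weight / hess_per_row)

# A split creates two children, so this is the minimum row count for any split.
min_rows_to_split <- 2 * rows_per_child

# Necessary (not sufficient) condition for the first tree to contain a split.
enough_rows_for_split <- function(n_rows) n_rows >= min_rows_to_split

enough_rows_for_split(5)   # the 5-row data from the question
enough_rows_for_split(20)  # the 20-row data above
```

If that is the cause, lowering min_child_weight in the call to xgboost would be another thing to try on very small data, though I have not verified it on the original 5 rows.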

Upvotes: 1
