Reputation: 1
I built a GBM classifier in R using the gbm package and exported it to PMML with r2pmml.
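library(gbm)
library(r2pmml)
# Fit a Bernoulli GBM on the full training set with 5-fold cross-validation
gbm2 <- gbm(deal_stage ~ ., data = train, train.fraction = 1,
            interaction.depth = 4, shrinkage = 0.001,
            n.trees = 6000, bag.fraction = 0.5, cv.folds = 5,
            distribution = "bernoulli", verbose = TRUE)
# Export the fitted model to a PMML file
r2pmml(gbm2, "/gbm_test.pmml", compact = TRUE)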
Then, in Python, when I make predictions from the PMML file, I get different results from the ones I got in R.
from pypmml import Model
model = Model.fromFile('gbm_test.pmml')
model.predict(observation)
Overall, the accuracy on both the training set and the test set differs between the R model and the PMML model scored in Python. My dataset contains integer and string features, and some fields have missing values, which the classifier should normally handle.
I would greatly appreciate any advice on what I should change so that my predictions in Python match what I observe in R. Thanks!
Upvotes: 0
Views: 493
Reputation: 4926
If you're using JPMML tools to convert models to PMML files, then you should also be using JPMML evaluators to score those PMML files. The JPMML software project has extensive integration test coverage, which covers the whole pipeline.
Do you get the correct predictions if you switch from PyPMML to JPMML-Evaluator-Python?
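To check whether a given evaluator reproduces the R results, it may help to compare predicted probabilities row by row rather than overall accuracy. Below is a minimal sketch, assuming you have exported the R probabilities (for example from predict(..., type = "response")) and collected the Python-side probabilities into two equal-length lists. The function name compare_predictions, its arguments, and the tolerance are hypothetical and not part of any PMML library.

```python
import math


def compare_predictions(r_probs, py_probs, tol=1e-6):
    """Compare two equal-length sequences of predicted probabilities.

    Returns (max_abs_diff, mismatched_indices). All names here are
    illustrative assumptions, not an API of gbm, r2pmml, or any evaluator.
    """
    if len(r_probs) != len(py_probs):
        raise ValueError("prediction lists must have the same length")

    max_diff = 0.0
    mismatched = []
    for i, (r, p) in enumerate(zip(r_probs, py_probs)):
        # A missing prediction on either side counts as a mismatch.
        if r is None or p is None or math.isnan(r) or math.isnan(p):
            mismatched.append(i)
            continue
        diff = abs(r - p)
        max_diff = max(max_diff, diff)
        if diff > tol:
            mismatched.append(i)
    return max_diff, mismatched


if __name__ == "__main__":
    # Made-up numbers purely to show the call; replace with real predictions.
    r_side = [0.10, 0.50, 0.90]
    py_side = [0.10, 0.70, 0.90]
    print(compare_predictions(r_side, py_side))
```

If the mismatched rows turn out to be mostly the ones with missing values or particular string categories, that points to how the evaluator treats those inputs rather than to the model itself.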
Upvotes: 0