sogipec

Reputation: 1

Incorrect predictions after exporting a model from R to PMML and scoring it in Python

I have built a GBM classifier in R using the gbm library.

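library(gbm)     # provides gbm()
library(r2pmml)  # provides r2pmml()

# Fit a Bernoulli GBM classifier for deal_stage on all other columns
gbm2 <- gbm(deal_stage ~ ., data = train, train.fraction = 1,
            interaction.depth = 4, shrinkage = .001,
            n.trees = 6000, bag.fraction = 0.5, cv.folds = 5,
            distribution = "bernoulli", verbose = TRUE)

# Export the fitted model to PMML
r2pmml(gbm2, "/gbm_test.pmml", compact = TRUE)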

Then, in Python, when I score observations with the PMML file, I get different predictions from the ones I had in R.

from pypmml import Model
model = Model.fromFile('gbm_test.pmml')
model.predict(observation)

Overall, the accuracy on both the training set and the test set differs between the R model and the PMML model scored in Python. My dataset contains integer and string features. Some fields have missing values, which the classifier should normally handle.

I would greatly appreciate any advice on what I should change so that my predictions in Python match what I observe in R. Thanks!

Upvotes: 0

Views: 493

Answers (1)

user1808924

Reputation: 4926

If you're using JPMML tools to convert models to PMML files, then you should also be using JPMML evaluators to score these PMML files. The JPMML software project has extensive integration test coverage, which covers the whole pipeline from conversion to scoring.

Do you get the correct predictions if you switch from PyPMML to JPMML-Evaluator-Python?
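To check this, a minimal sketch along the following lines may help. It assumes a recent jpmml_evaluator release in which make_evaluator accepts the PMML path as its first argument, plus a working Java installation. The helper names compare_predictions and score_with_jpmml are hypothetical, and the name of the probability column in the result depends on the output fields declared in your PMML file, so inspect the returned frame rather than relying on a fixed name.

```python
import math


def compare_predictions(r_probs, py_probs, tol=1e-6):
    """Return the row indices where two probability lists differ by more than tol."""
    if len(r_probs) != len(py_probs):
        raise ValueError("prediction lists must have the same length")
    mismatches = []
    for i, (r, p) in enumerate(zip(r_probs, py_probs)):
        # Absolute tolerance only; a NaN on either side counts as a mismatch.
        if not math.isclose(r, p, rel_tol=0.0, abs_tol=tol):
            mismatches.append(i)
    return mismatches


def score_with_jpmml(pmml_path, frame):
    """Score a pandas DataFrame with JPMML-Evaluator-Python.

    Assumption: jpmml_evaluator is installed and Java is available.
    The import is done lazily so compare_predictions works without it.
    """
    from jpmml_evaluator import make_evaluator

    evaluator = make_evaluator(pmml_path).verify()
    return evaluator.evaluateAll(frame)
```

One way to use it: export the probabilities from predict(gbm2, ..., type = "response") in R for a handful of rows, score the same rows in Python, and pass both lists to compare_predictions. Rows that are flagged are good candidates for checking how missing values and string features are being passed in.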

Upvotes: 0
