Reputation: 139
I have an xgboost model that was trained in pure Python and converted to PMML format. Now I need to use this model in a PySpark script, but I am out of ideas on how to do it. Are there methods that allow importing a PMML model in Python and using it for prediction? Thanks for any suggestions.
BR,
Vladimir
Upvotes: 1
Views: 1658
Reputation: 226
You could use PyPMML-Spark to import a PMML model in a PySpark script, for example:
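from pypmml_spark import ScoreModel
# Load the PMML model from a file
model = ScoreModel.fromFile('the/pmml/file/path')
# df is an existing Spark DataFrame holding the model's input columns
score_df = model.transform(df)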
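Before scoring, it can help to check that the DataFrame actually has the columns the model expects. The following is only a minimal sketch using the standard library: it assumes input columns are matched to the PMML fields by name (worth verifying against the PyPMML-Spark documentation), and it reads the first MiningSchema element in the file. The helper names `pmml_input_fields` and `missing_columns`, and the `SAMPLE_PMML` fragment, are hypothetical and made up for illustration; the fragment is trimmed and is not a complete model.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit('}', 1)[-1]


def pmml_input_fields(pmml_text):
    """Return the names of active (non-target) fields in the first MiningSchema."""
    root = ET.fromstring(pmml_text)
    for elem in root.iter():
        if local_name(elem.tag) == 'MiningSchema':
            return [
                field.get('name')
                for field in elem
                if local_name(field.tag) == 'MiningField'
                # usageType defaults to "active" when the attribute is absent
                and field.get('usageType', 'active') == 'active'
            ]
    return []


def missing_columns(pmml_text, columns):
    """List model input fields that are absent from the given column names."""
    available = set(columns)
    return [name for name in pmml_input_fields(pmml_text) if name not in available]


# Hypothetical, trimmed PMML fragment used only to exercise the helpers.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary>
    <DataField name="y" optype="continuous" dataType="double"/>
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
  </DataDictionary>
  <MiningModel functionName="regression">
    <MiningSchema>
      <MiningField name="y" usageType="target"/>
      <MiningField name="x1"/>
      <MiningField name="x2"/>
    </MiningSchema>
  </MiningModel>
</PMML>"""

print(pmml_input_fields(SAMPLE_PMML))
print(missing_columns(SAMPLE_PMML, ['x1', 'id']))
```

With a real Spark DataFrame, something like `missing_columns(open('the/pmml/file/path', 'rb').read(), df.columns)` would give the same check before calling `transform`.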
Upvotes: 1
Reputation: 13001
Spark does not support importing from PMML directly. I have not encountered a PySpark PMML importer, but there is one for Java (https://github.com/jpmml/jpmml-evaluator-spark). What you can do is wrap the Java (or Scala) code so that you can access it from Python (e.g. see http://aseigneurin.github.io/2016/09/01/spark-calling-scala-code-from-pyspark.html).
Upvotes: 3