Qululu

Reputation: 1080

Apache Spark MLlib: How to import model from PMML

I have a PMML file which encodes a logistic regression model that was NOT exported from MLlib.

How can I import the model from PMML using MLlib in Java for evaluation/prediction?

(I know that MLlib can export to PMML, but I need to import from PMML)

Upvotes: 8

Views: 5224

Answers (3)

PredictFuture

Reputation: 226

You could use PMML4S-Spark to import PMML as a SparkML transformer, then make predictions/evaluations in Scala, for example:

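import org.pmml4s.spark.ScoreModel

// Load the PMML file as a Spark ML transformer
val model = ScoreModel.fromFile("the/pmml/model/path")
// df is an existing DataFrame whose columns match the model's input fields
val scoreDf = model.transform(df)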

If you use PySpark, you could use PyPMML-Spark, for example:

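from pypmml_spark import ScoreModel

# Load the PMML file as a Spark ML transformer
model = ScoreModel.fromFile('the/pmml/model/path')
# df is an existing DataFrame whose columns match the model's input fields
score_df = model.transform(df)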

Upvotes: 1

tjb305

Reputation: 2630

Have you considered using a PMML loader such as jpmml-spark? You can run into interoperability issues depending on where the model was built and which PMML exporter was used. I believe sklearn2pmml is based on the JPMML library, so the two should interoperate well when used together.

Upvotes: 0

user1808924

Reputation: 4926

To import, you need to perform PMML export operations in the reverse order:

  1. Extract the intercept and feature coefficients from PMML's RegressionModel/RegressionTable element.
  2. Instantiate Spark ML's LogisticRegressionModel object using those values.
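The two steps above can be sketched in plain Java. This is a minimal illustration, not a definitive implementation. It assumes a binary model whose first RegressionTable holds the positive class, uses only NumericPredictor entries, and assumes the logit normalization method. The class name PmmlLogisticSketch and its methods are hypothetical. It uses only the JDK's XML parser and applies the sigmoid by hand, so it runs without Spark. With Spark on the classpath, the extracted values could instead be passed to the RDD-based MLlib constructor LogisticRegressionModel(Vector weights, double intercept), with the coefficients ordered to match your feature vector.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads a binary logistic regression from PMML text.
class PmmlLogisticSketch {
    final double intercept;
    // Feature name -> coefficient, kept in document order.
    final Map<String, Double> coefficients;

    PmmlLogisticSketch(double intercept, Map<String, Double> coefficients) {
        this.intercept = intercept;
        this.coefficients = coefficients;
    }

    // Step 1: extract the intercept and coefficients from the first RegressionTable.
    static PmmlLogisticSketch fromXml(String pmml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pmml.getBytes(StandardCharsets.UTF_8)));
            // Assumption: the first RegressionTable describes the positive class.
            Element table = (Element) doc.getElementsByTagName("RegressionTable").item(0);
            double intercept = Double.parseDouble(table.getAttribute("intercept"));
            Map<String, Double> coefs = new LinkedHashMap<>();
            NodeList predictors = table.getElementsByTagName("NumericPredictor");
            for (int i = 0; i < predictors.getLength(); i++) {
                Element p = (Element) predictors.item(i);
                coefs.put(p.getAttribute("name"),
                          Double.parseDouble(p.getAttribute("coefficient")));
            }
            return new PmmlLogisticSketch(intercept, coefs);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read PMML regression table", e);
        }
    }

    // Step 2 (stand-in for the Spark model): sigmoid of intercept + sum(coef * x).
    // Features missing from the input map are treated as 0.0.
    double predictProbability(Map<String, Double> features) {
        double margin = intercept;
        for (Map.Entry<String, Double> e : coefficients.entrySet()) {
            margin += e.getValue() * features.getOrDefault(e.getKey(), 0.0);
        }
        return 1.0 / (1.0 + Math.exp(-margin));
    }
}
```

Check the normalizationMethod attribute of your RegressionModel element before relying on this. A softmax model with two non-trivial tables, or tables containing CategoricalPredictor entries, would need extra handling.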

This is my second time posting this answer. I wonder why the first answer was deleted (without any discussion/explanation)?

Upvotes: 0
