Youfa Mao

Reputation: 159

Can a model saved with mlflow.spark be loaded as a Spark/Scala Pipeline?

Our algorithm engineer is developing a machine learning model using PySpark and MLflow. He is trying to save the model with the mlflow.spark API, and the saved model is in the native Spark MLlib format. Can the model be loaded from Spark Scala code? MLflow seems quite restricted for cross-language usage.

Upvotes: 2

Views: 1057

Answers (1)

Andre

Reputation: 354

The MLflow Java/Scala client does not have feature parity with the MLflow Python client: it lacks the concepts of Projects and Models. However, you can still read a PySpark-generated Spark ML model in Scala by fetching it with the downloadArtifacts method.

https://mlflow.org/docs/latest/java_api/org/mlflow/tracking/MlflowClient.html#downloadArtifacts-java.lang.String-java.lang.String-

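%python
import mlflow.spark
# Log the fitted PySpark pipeline model under the artifact path "spark-model"
mlflow.spark.log_model(model, "spark-model")

%scala
import org.apache.spark.ml.PipelineModel
import org.mlflow.tracking.MlflowClient
// Assumes the tracking URI comes from the environment (MLFLOW_TRACKING_URI); runId is the run that logged the model
val client = new MlflowClient()
// The native Spark ML files live in the "sparkml" subdirectory of the logged artifact
val modelPath = client.downloadArtifacts(runId, "spark-model/sparkml").getAbsolutePath
val model = PipelineModel.load(modelPath)
// data is the DataFrame to score
val predictions = model.transform(data)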
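
The Scala side needs two values from the Python side: the run ID and the artifact path including the "sparkml" subdirectory. Below is a minimal sketch of how these could be collected when logging. The helper names (sparkml_artifact_path, log_for_scala) are hypothetical and not part of MLflow, and the "sparkml" subdirectory is assumed from the path used in the Scala snippet above, so check it against your MLflow version.

```python
import posixpath

# Assumption: mlflow.spark.log_model writes the native Spark ML files into a
# "sparkml" subdirectory of the logged artifact path (as in the Scala snippet above).
SPARKML_SUBDIR = "sparkml"


def sparkml_artifact_path(artifact_path: str) -> str:
    """Return the run-relative path to pass to downloadArtifacts on the Scala side."""
    return posixpath.join(artifact_path.strip("/"), SPARKML_SUBDIR)


def log_for_scala(model, artifact_path: str = "spark-model"):
    """Log a fitted PySpark PipelineModel and return (run_id, sparkml_path).

    Hypothetical helper, not part of MLflow.
    """
    # Imported lazily so the path helper above works without MLflow installed.
    import mlflow
    import mlflow.spark

    # Starts a new run; assumes no other run is currently active.
    with mlflow.start_run() as run:
        mlflow.spark.log_model(model, artifact_path)
        return run.info.run_id, sparkml_artifact_path(artifact_path)
```

Calling run_id, path = log_for_scala(model) would give the two arguments for client.downloadArtifacts(runId, path) in the Scala cell.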

Upvotes: 1
