Reputation: 2416
I want to save a random forest regression model in PMML from R, and load it in Spark (Scala or Java). Unfortunately I have issues in the second step.
A minimal example of saving a PMML of a random forest regresion model in R is provided below.
When I try to load this model from Scala or Java using jpmml (see code below), I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: http://www.dmg.org/PMML-4_3
I can overcome this error editing the xml file: the attribute "xmlns" in the tag "PMML" contains the url that appears in the error message. If I remove completely the url or I change 4_3 to 4_2, this error disappears. However, a new error message appears:
Exception in thread "main" org.jpmml.evaluator.UnsupportedFeatureException (at or around line 19): MiningModel
Do you have please any suggestions or ideas on how to solve this specific error or, more in general, how to load in Scala a pmml created in R?
Thank you!
Update: The problem, as answered by @user1808924, was the version of the jpmml library. The code quoted below now works fine. The correct libs should be loaded, for example using the Maven Central Repository:
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>pmml-evaluator</artifactId>
<version>1.3.6</version>
</dependency>
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>pmml-model</artifactId>
<version>1.3.7</version>
</dependency>
<dependency>
<groupId>org.jpmml</groupId>
<artifactId>pmml-spark</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
Minimal example of saving a PMML of a random forest regresion model in R:
library(randomForest)
library(r2pmml)
data(mtcars)
MPGmodel.rf <- randomForest(mpg~., mtcars, ntree=5, do.trace=1)
# with package "r2pmml", convert model to pmml version 4.3 and save to xml:
r2pmml(MPGmodel.rf, "MPGmodel-r2pmml.pmml")
Loading the model in Scala:
import java.io.File
import org.jpmml.evaluator.Evaluator
import org.jpmml.spark.EvaluatorUtil
val fileNamePmml = "MPGmodel-r2pmml.pmml"
val pmmlFile = new File(fileNamePmml)
// the "UnsupportedFeature MiningModel" error appears here:
val myEvaluator: Evaluator = EvaluatorUtil.createEvaluator(pmmlFile)
I've also tried to load the model using Java, with identical error messages:
import org.dmg.pmml.PMML;
import org.jpmml.evaluator.ModelEvaluator;
import org.jpmml.evaluator.ModelEvaluatorFactory;
import java.io.*;
import java.util.Scanner;
import java.io.ByteArrayInputStream;
File pmmlFile = new File(fileNamePmml );
// the pmml file is successfully loaded as a string:
String pmmlString = null;
pmmlString = new Scanner(pmmlFile).useDelimiter("FILEFINISHESHERE").next();
// a PMML object is successfully created from the pmml string:
PMML myPmml = null;
try(InputStream is = new ByteArrayInputStream(pmmlString.getBytes())){
myPmml = org.jpmml.model.PMMLUtil.unmarshal(is);
}
// the "UnsupportedFeature MiningModel" error appears here:
ModelEvaluatorFactory modelEvaluatorFactory = ModelEvaluatorFactory.newInstance();
ModelEvaluator<?> modelEvaluator = modelEvaluatorFactory.newModelEvaluator(myPmml);
Upvotes: 3
Views: 2197
Reputation: 226
You could use PMML4S to load the PMML model in Scala, for example:
import org.pmml4s.model.Model
val model = Model.fromFile("MPGmodel-r2pmml.pmml")
val result = model.predict(data)
The input data
could be a map, a list of pairs of keys and values, an array, json, or PMML4S's Series.
Upvotes: 1
Reputation: 4926
You're using a legacy JPMML library, which was discontinued 3+ years ago. Naturally, it doesn't support new PMML features (such as PMML 4.2 and 4.3 schemas) that have been added since then.
Simply upgrade to the JPMML-Evaluator library. As a bonus, your code will be much shorter and cleaner.
Upvotes: 2