Unable to run transform in MLeap runtime from Spark model

Question

I'm currently testing MLeap in order to run predictions with a Spark model. To do that, I first implemented the Spark linear regression example described here: https://spark.apache.org/docs/2.3.0/ml-classification-regression.html#linear-regression I was able to save the model in an MLeap bundle and reuse it in another Spark context. Now I'd like to use this bundle in the MLeap runtime, but I'm facing casting issues that keep it from working correctly.

The error comes from the schema definition:

       val dataSchema = StructType(Seq(
                          StructField("label", ScalarType.Double),
                          StructField("features", ListType.Double)
                        )).get

The "features" field is a set of columns grouped together. I've tried many things, but no luck. First, with a list type:

       val dataSchema = StructType(Seq(
                          StructField("label", ScalarType.Double),
                          StructField("features", ListType.Double)
                        )).get

This gives me:

    java.lang.IllegalArgumentException: Cannot cast ListType(double,true) to TensorType(double,Some(WrappedArray(10)),true)

So I tried:

       val dataSchema = StructType(Seq(
                          StructField("label", ScalarType.Double),
                          StructField("features", TensorType.Double(10))
                        )).get

but that gave me:

    java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to ml.combust.mleap.tensor.Tensor

Here is the whole piece of code:

    // Schema: a scalar label plus a 10-element tensor of features
    val dataSchema = StructType(Seq(
      StructField("label", ScalarType.Double),
      StructField("features", TensorType.Double(10))
    )).get

    // One row of test data; the features are passed as a plain Seq
    val data = Seq(Row(-9.490009878824548, Seq(
      0.4551273600657362, 0.36644694351969087, -0.38256108933468047,
      -0.4458430198517267, 0.33109790358914726, 0.8067445293443565,
      -0.2624341731773887, -0.44850386111659524, -0.07269284838169332,
      0.5658035575800715)))

    // Load the bundle that was saved from Spark
    val bundle = (for (bundleFile <- managed(BundleFile("jar:file:/tmp/spark-lrModel.zip"))) yield {
      bundleFile.loadMleapBundle().get
    }).tried.get

    val model = bundle.root
    val to_test = DefaultLeapFrame(dataSchema, data)
    val res = model.transform(to_test).get // this call raises the exception

I'm a bit lost with this type mapping. Any ideas?

Thanks,

Stéphane
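
One possible reading of the second error, offered as a sketch rather than a confirmed fix: `scala.collection.immutable.$colon$colon` is the class of a non-empty Scala `List`, which is what `Seq(...)` builds by default, so the row seems to hold a plain list where the `TensorType` schema expects an `ml.combust.mleap.tensor.Tensor`. Under that assumption, wrapping the feature values with `Tensor.denseVector` should make the row match the schema. The import paths below assume the MLeap 0.9+ package layout (older releases place `Row` and `DefaultLeapFrame` under `ml.combust.mleap.runtime`), and the object name `TensorRowSketch` is made up for illustration.

```scala
import ml.combust.mleap.core.types.{ScalarType, StructField, StructType, TensorType}
import ml.combust.mleap.runtime.frame.{DefaultLeapFrame, Row}
import ml.combust.mleap.tensor.Tensor

// Hypothetical wrapper object; only the schema and row construction matter here.
object TensorRowSketch {
  // Same schema as in the question: scalar label, 10-element feature tensor
  val dataSchema: StructType = StructType(Seq(
    StructField("label", ScalarType.Double),
    StructField("features", TensorType.Double(10))
  )).get

  // Assumption: a dense 1-D tensor is what the TensorType column expects,
  // so the feature values are wrapped instead of being passed as a Seq.
  val features: Tensor[Double] = Tensor.denseVector(Array(
    0.4551273600657362, 0.36644694351969087, -0.38256108933468047,
    -0.4458430198517267, 0.33109790358914726, 0.8067445293443565,
    -0.2624341731773887, -0.44850386111659524, -0.07269284838169332,
    0.5658035575800715))

  val data: Seq[Row] = Seq(Row(-9.490009878824548, features))

  // Frame ready to be passed to model.transform(...)
  val toTest: DefaultLeapFrame = DefaultLeapFrame(dataSchema, data)
}
```

With this in place, `TensorRowSketch.toTest` would take the place of `to_test` in the `model.transform` call from the question. Whether the transform then succeeds also depends on the bundle itself, which this sketch does not cover.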
