Reputation: 1561
I started using MLeap as a serialization tool: it lets me save a model from Spark or scikit-learn and load it for inference with the MLeap Runtime. It works well.
Now I want to load a model saved with MLeap into my own Java code, into my own structures, without the MLeap Runtime. I looked into it a bit and haven't found any "format definition" or "schema", only examples that show what some serialized models look like. From that perspective, MLeap looks like just a serialization/deserialization tool, not a "format" as the main page of the documentation claims.
So, is MLeap a "format" or just a serialization tool? Can I find a format definition or schema somewhere?
To restate my goal: I want to understand whether it's possible to write a custom serializer/deserializer for the MLeap format, or whether the only option is to use the MLeap tools for that.
Upvotes: 0
Views: 333
Reputation: 380
I would say that MLeap is a framework for putting models into production without the overhead of the frameworks in which you trained them, which is what gives you the low latency. Serialization and deserialization are definitely an important part of that, and you do have some freedom in how you store your pipelines.
I recommend having a look at the bundles (zip files) you create with MLeap, which contain the exported pipelines. Most of the serializations are easy to understand: a logistic regression, for example, is stored in a JSON file that holds the identifier of the pipeline element and the coefficients, which is basically everything that defines the logistic regression model.
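To inspect a bundle from Java without the MLeap Runtime, something like the following minimal sketch may help. It assumes the bundle was saved as a zip archive, as described above, and uses only the JDK (Java 9+ for `readAllBytes`). The class name `BundleInspector` and the method `readJsonEntries` are made up for illustration and are not part of MLeap. It does not assume any particular file names inside the bundle; it just collects every `.json` entry so you can see what is in there.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of MLeap: dumps the JSON files found in a zip bundle.
public class BundleInspector {

    /** Returns a map of entry name -> JSON text for every ".json" entry in the archive. */
    public static Map<String, String> readJsonEntries(Path bundleZip) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipFile zip = new ZipFile(bundleZip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Skip directories and anything that is not a JSON file.
                if (entry.isDirectory() || !entry.getName().endsWith(".json")) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                    result.put(entry.getName(), text);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BundleInspector <path-to-bundle.zip>");
            return;
        }
        // Print each JSON entry with its path inside the archive.
        for (Map.Entry<String, String> e : readJsonEntries(Paths.get(args[0])).entrySet()) {
            System.out.println("== " + e.getKey());
            System.out.println(e.getValue());
        }
    }
}
```

From there you could feed the JSON text into whatever parser you already use (Jackson, Gson, and so on) and map the fields onto your own structures. Which fields to expect per model type is something you would have to read off the dumped files, since this sketch does not encode any schema.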
Upvotes: 0