Reputation: 2354
I have a few questions about Sparkling Water and why it is needed.
Let's assume that I have a trained H2O model exported in both binary and POJO form.
Now I want to deploy the model to production, and I have the option of using either the POJO or the binary model (with Sparkling Water).
One option uses Spark to run the POJO model.
The other trains and runs the model in Sparkling Water.
What advantages does Sparkling Water (H2O) provide over plain Spark?
Upvotes: 8
Views: 2928
Reputation: 510
Which one should I use: plain Spark with the POJO, or Sparkling Water with the binary model?
What exactly is Sparkling Water for, when we can easily deploy a model using the POJO and Spark alone?
Is Sparkling Water needed only when you have to train a model on huge amounts of data, or can it also be used for production deployments of models?
If putting a model in "production" means having "always on" scoring exposed as a REST endpoint or similar, the POJO/MOJO is the way to go (H2O clusters are not highly available). You'll need to make sure you're handling incoming data correctly yourself, though.
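To make "handling incoming data correctly yourself" a bit more concrete, here is a minimal sketch of the kind of checks a scoring service might run on each record before handing it to the POJO/MOJO. It is written in Python only for brevity; the same idea applies in a Java service. The schema, column names, and the `prepare_row` helper are all hypothetical, and mapping unseen categorical levels to missing is one possible policy, not something H2O requires.

```python
from typing import Any, Dict, Mapping, Optional, Set, Tuple

# Hypothetical schema: column name -> (kind, allowed categorical levels).
# In practice this would be derived from the training frame, not hard-coded.
Schema = Mapping[str, Tuple[str, Optional[Set[str]]]]

SCHEMA: Schema = {
    "age": ("numeric", None),
    "plan": ("categorical", {"basic", "premium"}),
}


def prepare_row(raw: Mapping[str, Any], schema: Schema = SCHEMA) -> Dict[str, Any]:
    """Coerce one incoming record to the columns and types the model expects."""
    row: Dict[str, Any] = {}
    for name, (kind, levels) in schema.items():
        value = raw.get(name)
        if value is None or value == "":
            # Absent or empty fields are passed on as missing values.
            row[name] = None
        elif kind == "numeric":
            # Raises ValueError on malformed numbers so the caller can reject the request.
            row[name] = float(value)
        else:
            text = str(value)
            # Assumed policy: levels not seen in training are treated as missing.
            row[name] = text if levels is not None and text in levels else None
    # Fields not listed in the schema are dropped on purpose.
    return row
```

For example, `prepare_row({"age": "42", "plan": "gold", "extra": 1})` keeps only the two known columns, converts `age` to a float, and turns the unknown `plan` level into a missing value.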
If you are doing batch scoring, nightly or otherwise, then it may make sense to use the binary model with Sparkling Water, because parsing incoming data becomes trivial (asH2OFrame(..)) and scoring is as easy as calling predict().
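The batch flow described above could look roughly like this in PySparkling. This is a sketch under assumptions, not a definitive recipe: the function names `score_batch` and `run_with_sparkling_water` are made up for illustration, and the method names on `H2OContext` (`getOrCreate`, `asH2OFrame`, `asSparkFrame`) follow recent Sparkling Water releases; older releases spell them differently (for example `as_h2o_frame`), so check the version you run.

```python
def score_batch(hc, load_model, spark_df, model_path):
    """Score a Spark DataFrame with a saved binary H2O model.

    hc         -- an H2OContext-like object (assumed to offer asH2OFrame / asSparkFrame)
    load_model -- a callable that loads a binary model from a path, e.g. h2o.load_model
    """
    frame = hc.asH2OFrame(spark_df)   # Spark DataFrame -> H2OFrame
    model = load_model(model_path)    # load the binary model into the H2O cluster
    predictions = model.predict(frame)
    return hc.asSparkFrame(predictions)  # back to a Spark DataFrame for downstream jobs


def run_with_sparkling_water(spark_df, model_path):
    """Wire score_batch to a real cluster; needs pysparkling and h2o installed."""
    # Imported lazily so the module can be loaded without a Sparkling Water install.
    import h2o
    from pysparkling import H2OContext

    hc = H2OContext.getOrCreate()  # older releases expect the Spark session as an argument
    return score_batch(hc, h2o.load_model, spark_df, model_path)
```

Passing the context and the loader in as arguments keeps the scoring step easy to exercise with stand-in objects before pointing it at a real cluster. Note that a binary model generally has to be loaded by the same H2O version that saved it.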
Upvotes: 7