Reputation: 400
I am learning GCP, and came across Kubeflow and Google Cloud Composer.
From what I have understood, both are used to orchestrate workflows, letting the user schedule and monitor pipelines on GCP.
The only difference that I could figure out is that Kubeflow deploys and monitors Machine Learning models. Am I correct? In that case, since Machine Learning models are also objects, can't we orchestrate them using Cloud Composer? In what way is Kubeflow better than Cloud Composer when it comes to managing Machine Learning models?
Thanks
Upvotes: 5
Views: 3430
Reputation: 6787
Kubeflow and Kubeflow Pipelines
Kubeflow is not exactly the same as Kubeflow Pipelines. The Kubeflow project mostly develops Kubernetes operators for distributed ML training (TFJob, PyTorchJob). The Pipelines project, on the other hand, develops a system for authoring and running pipelines on Kubernetes. KFP also has some sample components, but the main product is the pipeline authoring SDK and the pipeline execution engine.
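To make "authoring SDK plus execution engine" a bit more concrete, here is a toy sketch in plain Python. It is not the KFP API: `run_pipeline`, the step names, and the data are all made up for illustration, and the "engine" is just the standard library's `graphlib`. The idea it shows is that you declare steps and their dependencies, and something else decides the order and passes outputs downstream.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(steps, dependencies):
    """Toy engine: run each step after all of its upstream steps.

    steps:        maps a step name to a callable
    dependencies: maps a step name to the set of step names it depends on
    Each step receives its upstream results as keyword arguments.
    """
    results = {}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = steps[name](**upstream)
        order.append(name)
    return order, results


# Hypothetical three-step ML pipeline (the "authoring" side).
steps = {
    "preprocess": lambda: [1.0, 2.0, 3.0],
    "train": lambda preprocess: sum(preprocess) / len(preprocess),
    "evaluate": lambda train: {"model_mean": train},
}
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
}

order, results = run_pipeline(steps, dependencies)
```

In a real pipeline system each step would typically run as its own container on the cluster rather than as a function call in one process; the sketch only captures the dependency-ordering part.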
Kubeflow Pipelines vs. Cloud Composer
The projects are pretty similar, but there are differences:
Upvotes: 7
Reputation: 2099
Both services run on Kubernetes, but they are based on different programming frameworks; therefore, you are correct, Kubeflow deploys and monitors Machine Learning models. See below the answers to your questions:
You would need to find an operator that meets your needs, or create a custom operator with the structure required to create a model, see this example. Even though it can be done, this could be more difficult than using Kubeflow.
Kubeflow hides complexity because it is focused on Machine Learning models. Frameworks specialized in machine learning make those things easier than Cloud Composer does; in this context, Cloud Composer can be considered a general-purpose tool (focused on linking existing services supported by the Airflow operators).
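For a rough idea of what "create a custom operator" involves: in Airflow, a custom operator subclasses `BaseOperator` and implements `execute(context)`. The sketch below mimics that shape with a stand-in base class so it runs without Airflow installed. `BaseOperatorStandIn`, `TrainModelOperator`, and the "training" logic are hypothetical and only illustrate how much of the model-building code you would have to own yourself.

```python
class BaseOperatorStandIn:
    """Stand-in for Airflow's BaseOperator (assumption: used only so the
    sketch runs without Airflow; a real operator would subclass BaseOperator)."""

    def __init__(self, task_id):
        self.task_id = task_id

    def execute(self, context):
        raise NotImplementedError


class TrainModelOperator(BaseOperatorStandIn):
    """Hypothetical operator that 'trains' a model by fitting a mean."""

    def __init__(self, task_id, training_data):
        super().__init__(task_id)
        self.training_data = training_data

    def execute(self, context):
        # All of the ML-specific work lives here and is yours to write.
        mean = sum(self.training_data) / len(self.training_data)
        # "ds" is the date-stamp key Airflow puts in the task context.
        return {"run_date": context["ds"], "model": {"mean": mean}}


op = TrainModelOperator(task_id="train_model", training_data=[2.0, 4.0, 6.0])
result = op.execute(context={"ds": "2024-01-01"})
```

In a real DAG the scheduler calls `execute` for you and supplies the context; calling it by hand here is only to show the data flow.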
Upvotes: 5
Reputation: 1994
Taking this straight from kubeflow.org
The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.
As you can see, it is a suite made up of many pieces of software that are useful across the life cycle of an ML model. It comes with TensorFlow, Jupyter, etc. The real selling point of Kubeflow is "easy deployment of an ML model at scale on a Kubernetes cluster".
However, on GCP you already have an ML suite in the cloud: Datalab, Cloud Build, etc. So I don't know how efficient spinning up a Kubernetes cluster will be if you don't need the "portability" factor.
Cloud Composer is the real deal when talking about orchestration of a workflow. It is a "managed" version of Apache Airflow, and it is ideal for any "simple" workflow that changes a lot, since you can change it via a visual UI and with Python.
It is also ideal to automate infrastructure operations:
Upvotes: 5