Reputation: 48436
How do I set up TensorFlow in the Google cloud? I understand how to create a Google Compute Engine instance, and how to run TensorFlow locally; and a recent Google blog post suggests that there ought to be a way to create a Google Compute Engine instance and run TensorFlow applications in the cloud:
Machine Learning projects can come in many sizes, and as we’ve seen with our open source offering TensorFlow, projects often need to scale up. Some small tasks are best handled with a local solution running on one’s desktop, while large scale applications require both the scale and dependability of a hosted solution. Google Cloud Machine Learning aims to support the full range and provide a seamless transition from local to cloud environment.
Even if I'm reading a bit much into this, it has to be the case, given what competing platforms such as Microsoft's Azure offer, that there's a way to set up TensorFlow applications (developed locally and "seamlessly" scaled up into the cloud, presumably using GPUs) in the Google cloud.
For example, I'd like to work locally in my IDE tuning the features and code for my project, running limited training and validation there, and periodically push the code to the cloud to train there with (arbitrarily) greater resources, and then save and download the trained model. Or perhaps even better, just run the graphs (or parts of graphs) in the cloud using tunable resources.
Is there a way to do this; is one planned? How do I set up TensorFlow in the Google cloud?
Upvotes: 16
Views: 12553
Reputation: 10463
Jan 2025 - initially Google had Linux images with TensorFlow preinstalled. Now it is not that simple. When you create a larger GPU machine such as A2/A3, the images come with Debian and preinstalled tools like conda, scikit-learn, etc., but no TensorFlow.
And you cannot install TensorFlow globally, as that would break many Debian dependencies. I tried a plain pip install tensorflow
and it started uninstalling preinstalled packages like numpy and broke down in the middle.
So the only way currently, on an A2/A3 machine or higher, is to create a conda environment (conda is preinstalled) and install everything there, independently of the preinstalled packages, which somewhat defeats the purpose of using an image with everything preinstalled, since you can't actually use any of it.
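A minimal sketch of that conda workaround (the environment name `tf` and the Python version are arbitrary choices, not from the image docs):

```shell
# Create an isolated conda environment so pip cannot touch the Debian packages
conda create -n tf python=3.10 -y
conda activate tf

# Install TensorFlow (and anything else you need) inside the environment only
pip install tensorflow

# Sanity check that the GPU is visible from inside the environment
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```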
Also a consideration: once you start installing packages yourself, you can easily spend an hour or two, and a high-spec GPU VM can easily cost $100 per hour. But it is still cheaper than buying and running an H100 locally.
However, there is another option, if you are OK with 16 GB GPUs. You can use "click to deploy":
https://console.cloud.google.com/marketplace/details/click-to-deploy-images/deeplearning
It offers 16 GB GPUs on N1 machines with automatic installation of the NVIDIA drivers, and you can choose a TensorFlow version, so you get everything in one go. The hourly rates for 16 GB GPUs are also a lot cheaper than for 40 GB cards.
Upvotes: 0
Reputation: 802
Depending on the use case, there can be multiple ways. At the moment the following two methods come to my mind:
1) Select Project / Compute Engine / VM instances / Create VM instance. Then go to VM instances, check the instance, click on SSH (this needs "gcloud"), copy the command, and run it in Cloud Shell. Now you are in a virtual machine of your own. Install pip3 there, install TensorFlow (the CPU or GPU version), and use it :)
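The same steps can be sketched from the command line; the instance name, zone, and machine type below are placeholders, not values from the docs:

```shell
# Create a VM (name, zone, and machine type are examples; pick your own)
gcloud compute instances create tf-box \
    --zone=us-central1-a \
    --machine-type=n1-standard-4

# SSH into it
gcloud compute ssh tf-box --zone=us-central1-a

# On the VM: install pip3, then TensorFlow (CPU build shown)
sudo apt-get update && sudo apt-get install -y python3-pip
pip3 install tensorflow
```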
Currently, Google Cloud supports TensorFlow versions <= 1.4.
If you are interested in using tensorflow-gpu==2.0, you can use Google Cloud Functions: https://cloud.google.com/blog/products/ai-machine-learning/how-to-serve-deep-learning-models-using-tensorflow-2-0-with-cloud-functions
2) You can use Google Cloud AI Platform: https://cloud.google.com/ml-engine/docs/packaging-trainer
It also supports TensorFlow versions <= 1.4 at the moment.
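A sketch of submitting a packaged trainer with gcloud; the job name, bucket, package path, and module name here are all placeholders:

```shell
# Submit a training job to AI Platform (formerly ML Engine)
# trainer/ is your local Python package containing a task.py entry point
gcloud ml-engine jobs submit training my_job_001 \
    --package-path=trainer \
    --module-name=trainer.task \
    --staging-bucket=gs://my-bucket \
    --region=us-central1 \
    --runtime-version=1.4
```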
Upvotes: 0
Reputation: 18200
As described on the Kubernetes blog, you can run TensorFlow on Kubernetes. It links to "a step-by-step tutorial that shows you how to create the TensorFlow Serving Docker container to serve the Inception-v3 image classification model", which you should be able to adapt to running your own TensorFlow workload. You can use Google Container Engine to run Kubernetes on Google's cloud.
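As a taste of the serving side, the published tensorflow/serving Docker image is a shortcut compared to the tutorial, which builds the container itself; the model path and name below are examples:

```shell
# Pull the published TensorFlow Serving image
docker pull tensorflow/serving

# Serve a SavedModel from a local directory over the REST API on port 8501
docker run -p 8501:8501 \
    -v /models/my_model:/models/my_model \
    -e MODEL_NAME=my_model \
    tensorflow/serving
```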
Or, as Aaron mentioned, you can try to sign up for early access to Google's CloudML product.
Upvotes: 1
Reputation: 1812
One of the most straightforward ways to work with TensorFlow on the Google Cloud Platform, using TPU acceleration, is to use the ctpu command:
https://cloud.google.com/tpu/docs/quickstart
This will create everything you need and log you into a VM from where you can run your TensorFlow programs.
There is more information here on how to run ctpu from your desktop, if you want to avoid using the Google Cloud Shell:
https://github.com/tensorflow/tpu/tree/master/tools/ctpu
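The basic flow looks roughly like this (the name is optional; ctpu defaults to one derived from your username):

```shell
# Create a TPU plus a paired Compute Engine VM, then SSH into the VM
ctpu up --name=my-tpu

# When you are done, tear both down so you stop being billed
ctpu delete --name=my-tpu
```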
Upvotes: 1
Reputation: 4166
To run TensorFlow on Google Cloud, in order of preference:
(1) Use Cloud ML Engine. This is a fully managed service and supports both training and serving. You can choose between CPU, GPU and TPU.
(2) Use Deep Learning VM, which is a Google Compute Engine instance with TensorFlow already installed: https://cloud.google.com/deep-learning-vm/docs/ -- you can add GPUs to this instance.
(3) Use Kubeflow on GKE (TensorFlow on Kubernetes).
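For option (2), a Deep Learning VM can be created from the command line roughly like this; the instance name and zone are placeholders, and the image family and accelerator type are examples of values the Deep Learning VM docs list:

```shell
# Create a Deep Learning VM with TensorFlow preinstalled and one GPU attached
gcloud compute instances create my-dlvm \
    --zone=us-west1-b \
    --image-family=tf-latest-gpu \
    --image-project=deeplearning-platform-release \
    --accelerator="type=nvidia-tesla-v100,count=1" \
    --metadata="install-nvidia-driver=True" \
    --maintenance-policy=TERMINATE
```

Note that GPU instances require the TERMINATE maintenance policy, since they cannot be live-migrated.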
Upvotes: 0
Reputation: 2364
This is still in limited preview. The best you can do is sign up and hope that they select you to be part of the preview.
Edit: CloudML is now in public beta so anyone can use it without signing up and requesting access. We hope you give it a try! We have a tag for questions: google-cloud-ml.
Upvotes: 6
Reputation: 48310
I would suggest following this tutorial, which guides you step by step:
https://www.youtube.com/watch?v=N422_CYuzZg
Here is the main article to set up the account etc.
https://cloud.google.com/solutions/machine-learning-with-financial-time-series-data
Upvotes: 2