lixinso

Reputation: 793

Will Google Cloud Run support GPU/TPU some day?

So far Google Cloud Run supports only CPUs. Is there any plan to support GPUs? It would be super cool if GPUs were available; then I could demo my DL project without running a super expensive GPU instance.

Upvotes: 19

Views: 9650

Answers (4)

hiroshi

Reputation: 7251

https://cloud.google.com/sdk/docs/release-notes#48800_2024-08-13

Cloud Run
  • Added --gpu and --gpu-type to gcloud beta run deploy and gcloud beta run services update which allow deploying a service with GPU.

https://cloud.google.com/run/docs/configuring/services/gpu
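
As a rough sketch of what the release note describes, the snippet below assembles a `gcloud beta run deploy` invocation using the `--gpu` and `--gpu-type` flags quoted above. The service name, image path, region, and the `nvidia-l4` type value are placeholder assumptions and are not taken from the release note. Check the GPU configuration page linked above for the accepted values and any additional flags your project may need.

```python
import shlex


def build_gpu_deploy_command(service, image, region, gpu_count=1, gpu_type="nvidia-l4"):
    """Return the argv list for a GPU-enabled Cloud Run deploy (does not run it)."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return [
        "gcloud", "beta", "run", "deploy", service,
        "--image", image,
        "--region", region,
        "--gpu", str(gpu_count),   # flag named in the release note
        "--gpu-type", gpu_type,    # flag named in the release note; the value is an assumption
    ]


# Hypothetical service name, image, and region, for illustration only.
command = build_gpu_deploy_command(
    service="dl-demo",
    image="us-docker.pkg.dev/my-project/demo/dl-demo:latest",
    region="us-central1",
)
print(shlex.join(command))
```

The function only builds and prints the command. With the gcloud CLI installed and authenticated, the same list could be passed to `subprocess.run(command, check=True)`.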

Upvotes: 3

Jay Smith

Reputation: 71

We do now. We just introduced Cloud Run GPUs. Today we support only NVIDIA L4s, but we plan to support more.

https://cloud.google.com/blog/products/application-development/run-your-ai-inference-applications-on-cloud-run-with-nvidia-gpus

Upvotes: 4

Reza

Reputation: 19913

You can use GPUs with Cloud Run for Anthos:

https://cloud.google.com/anthos/run/docs/configuring/compute-power-gpu

Upvotes: 10

John Hanley

Reputation: 81424

So far Google Cloud Run supports only CPUs. Is there any plan to support GPUs? It would be super cool if GPUs were available; then I could demo my DL project without running a super expensive GPU instance.

I seriously doubt it. GPUs and TPUs are specialized hardware. Cloud Run is a managed container service that:

  1. Enables you to run stateless containers that are invokable via HTTP requests. This means that CPU-intensive applications are not supported. In between HTTP requests and responses, the CPU is idled to near zero. Your expensive GPUs/TPUs would sit idle.
  2. Autoscales based upon the number of requests per second. Launching 10,000 instances in seconds is easy to achieve. Imagine the billing support nightmare for Google, and the size of the bills, if customers could launch that many GPUs/TPUs.
  3. Is billed in 100 ms time intervals. Most requests fit into a few hundred milliseconds of execution. This is not a good execution or business model for CPU/GPU/TPU integration.
  4. Provides a billing model that reduces the cost of web services to near zero when they are not in use. You just pay to store your container images. When an HTTP request is received at the service URL, the container image is loaded into an execution unit and request processing resumes. Once requests stop, billing and resource usage also stop.
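
To make the billing arithmetic in point 3 concrete, here is a minimal sketch of how a request's billable time could be computed if durations are rounded up to the next 100 ms increment. The rounding-up behaviour is an assumption, the function names are hypothetical, and the per-second rate is a made-up placeholder rather than a real Cloud Run price.

```python
BILLING_INCREMENT_MS = 100  # granularity stated in point 3 above


def billable_ms(duration_ms):
    """Round an integer request duration up to the next 100 ms increment (assumed rule)."""
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    increments = (duration_ms + BILLING_INCREMENT_MS - 1) // BILLING_INCREMENT_MS
    return increments * BILLING_INCREMENT_MS


def request_cost(duration_ms, rate_per_second):
    """Cost of one request at a hypothetical per-second rate."""
    return billable_ms(duration_ms) / 1000 * rate_per_second


HYPOTHETICAL_RATE = 0.0001  # placeholder price per second, not a real quote

print(billable_ms(230))  # a 230 ms request is billed as 300 ms
print(request_cost(230, HYPOTHETICAL_RATE))
```

Under these assumptions a request that runs for a few hundred milliseconds is billed for only a few increments, which is the fine-grained model the answer contrasts with long-running GPU/TPU workloads.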

GPU/TPU workloads are best served by backend instances that protect and manage the processing power and the costs of these devices.

Upvotes: 13
