liuyl

Reputation: 175

How to run a TensorFlow GPU container on Google Compute Engine?

I am trying to run a TensorFlow container on Google Compute Engine with GPU accelerators.

I tried the command:

gcloud compute instances create-with-container job-name \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-k80 \
  --image-project=deeplearning-platform-release \
  --image-family=common-container \
  --container-image=gcr.io/my-container \
  --container-arg="--container-arguments=xxxx"

But I got this warning:

WARNING: This container deployment mechanism requires a Container-Optimized OS image in order to work. Select an image from a cos-cloud project (cos-stable, cos-beta, cos-dev image families).

I also tried system images from the cos-cloud project, but they don't seem to have the CUDA driver installed, because TensorFlow logs a cuInit failed warning.
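From the COS documentation it looks like the driver has to be installed separately and then mapped into the container by hand, roughly along these lines (a sketch only, assuming a single GPU at /dev/nvidia0 and the cos-extensions installer; gcr.io/my-container is my image from above):

# On the COS instance: install the NVIDIA driver with the documented COS installer
sudo cos-extensions install gpu

# Make the installed driver binaries and libraries executable
sudo mount --bind /var/lib/nvidia /var/lib/nvidia
sudo mount -o remount,exec /var/lib/nvidia

# Run the container with the driver files and GPU devices mapped in
docker run \
  --volume /var/lib/nvidia/lib64:/usr/local/nvidia/lib64 \
  --volume /var/lib/nvidia/bin:/usr/local/nvidia/bin \
  --device /dev/nvidia0:/dev/nvidia0 \
  --device /dev/nvidia-uvm:/dev/nvidia-uvm \
  --device /dev/nvidiactl:/dev/nvidiactl \
  gcr.io/my-container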

What is the correct way to run a TensorFlow container on Google Compute Engine with GPU support?

Upvotes: 5

Views: 1183

Answers (2)

northtree

Reputation: 9255

You could docker run your container from the startup script of a Deep Learning VM:


gcloud beta compute instances create deeplearningvm-$(date +"%Y%m%d-%H%M%S") \
--zone=us-central1-c \
--machine-type=n1-standard-8 \
--subnet=default \
--service-account=<your google service account> \
--scopes='https://www.googleapis.com/auth/cloud-platform' \
--accelerator=type=nvidia-tesla-k80,count=1 \
--image-project=deeplearning-platform-release \
--image-family=tf-latest-gpu \
--maintenance-policy=TERMINATE \
--metadata=install-nvidia-driver=True,startup-script='#!/bin/bash

# Wait until the NVIDIA driver has finished installing
while ! [[ -x "$(command -v nvidia-smi)" ]];
do
  echo "sleep to check"
  sleep 5s
done
echo "nvidia-smi is installed"

# Allow Docker to authenticate with Container Registry before pulling the image
gcloud auth configure-docker
echo "Docker run with GPUs"
docker run --gpus all --log-driver=gcplogs --rm gcr.io/<your container>

echo "Kill VM $(hostname)"
gcloud compute instances delete $(hostname) --zone \
$(curl -H Metadata-Flavor:Google http://metadata.google.internal/computeMetadata/v1/instance/zone -s | cut -d/ -f4) -q

'

Since it takes several minutes to install the NVIDIA driver, you have to wait until the installation finishes before starting your container. See https://cloud.google.com/ai-platform/deep-learning-vm/docs/tensorflow_start_instance#creating_a_tensorflow_instance_from_the_command_line

Compute Engine loads the latest stable driver on the first boot and performs the necessary steps (including a final reboot to activate the driver). It may take up to 5 minutes before your VM is fully provisioned. In this time, you will be unable to SSH into your machine. When the installation is complete, to guarantee that the driver installation was successful, you can SSH in and run nvidia-smi.
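Once the instance is reachable, a quick way to confirm the driver from your workstation (the instance name and zone below are placeholders):

# SSH in and run nvidia-smi to verify that the driver is active
gcloud compute ssh deeplearningvm-instance --zone=us-central1-c --command="nvidia-smi"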

Upvotes: 2

Ernesto U

Reputation: 806

Have you considered Cloud TPU on GKE?

This page describes how to set up a GKE cluster with GPUs.
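A rough sketch of that setup (the cluster name and accelerator type are assumptions; the driver-installer DaemonSet is the one referenced in the GKE GPU docs):

# Create a GKE cluster whose nodes each have one K80 GPU
gcloud container clusters create gpu-cluster \
  --zone=us-central1-c \
  --num-nodes=1 \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-k80,count=1

# Install the NVIDIA driver on the nodes with Google's driver-installer DaemonSet
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml

# Pods then request a GPU by setting the nvidia.com/gpu resource limit in their spec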

Upvotes: 0
