John F

Reputation: 1072

Google Cloud Platform AI Notebook - how to ensure GPU is being used?

I am using Jupyter on GCP (set up the easy way via the AI Platform) to train a MondrianForestRegressor from scikit-garden. My dataset is about 450000 x 300, and training on the machine as-is, even with parallelism (n_jobs=-1; 32 CPUs, 208 GB RAM), is far slower than I would like.

I attached GPUs (2x NVIDIA Tesla T4), restarted the instance, and tried again. Training speed seems unaffected by this change.

Upvotes: 0

Views: 580

Answers (1)

ebeltran

Reputation: 469

When you create a Notebook, AI Platform allocates a GCE VM instance and a GPU. To monitor the GPU, you should install the GPU metrics reporting agent on each VM instance that has a GPU attached; the agent collects GPU data and sends it to Stackdriver Monitoring.
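Independent of the metrics agent, a quick way to check whether the GPUs are actually doing work is to poll `nvidia-smi` from the instance while training runs. A minimal sketch (the `gpu_in_use` helper is hypothetical, not part of any GCP tooling):

```python
import shutil
import subprocess

def gpu_in_use(threshold=5):
    """Return True if any NVIDIA GPU on this machine reports
    utilization above `threshold` percent."""
    if shutil.which("nvidia-smi") is None:
        return False  # NVIDIA driver/tooling not installed on this VM
    proc = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if proc.returncode != 0:
        return False  # driver present but no usable GPU
    # One utilization number per attached GPU, e.g. "37\n0\n"
    return any(int(line) > threshold for line in proc.stdout.split() if line.strip())
```

If this stays at 0% utilization throughout training, the library is simply not using the GPU, regardless of what is attached to the VM.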

Additionally, there are two ways to make use of the GPUs:

  • High-level Estimator API: No code changes are necessary as long as your ClusterSpec is configured properly. If a cluster is a mixture of CPUs and GPUs, map the ps job name to the CPUs and the worker job name to the GPUs.

  • Core TensorFlow API: You must assign ops to run on GPU-enabled machines. This process is the same as using GPUs with TensorFlow locally. You can use tf.train.replica_device_setter to assign ops to devices.

Also, here is an article about when to use a GPU instead of a CPU, and here you can read about GPU performance for tree training.

Upvotes: 1

Related Questions