Reputation: 1072
I am using Jupyter on GCP (set up the easy way via the AI Platform) to train a MondrianForestRegressor from scikit-garden. My dataset is about 450000 x 300, and training on the machine as-is (32 CPUs, 208 GB RAM) is far slower than I would like, even with parallelism enabled via n_jobs=-1.
I attached GPUs (2x NVIDIA Tesla T4), restarted the instance and tried again. Training speed seems unaffected by this change.
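For reference, a minimal sketch of the kind of setup described (the question does not show the actual notebook code, so the data and hyperparameters below are stand-ins):

```python
# Hypothetical reconstruction of the training setup described above.
import numpy as np
from skgarden import MondrianForestRegressor

# Stand-in for the ~450000 x 300 dataset mentioned in the question.
X = np.random.rand(450_000, 300)
y = np.random.rand(450_000)

# n_jobs=-1 parallelises tree fitting across all available CPUs.
# scikit-garden (like scikit-learn) runs on the CPU, so this fit
# does not touch the attached GPUs.
mfr = MondrianForestRegressor(n_estimators=10, n_jobs=-1)
mfr.fit(X, y)
```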
Upvotes: 0
Views: 580
Reputation: 469
When you create a Notebook, it allocates a GCE VM instance and a GPU. To monitor the GPU, you should install the GPU metrics reporting agent on each VM instance that has a GPU attached; it collects GPU data and sends it to Stackdriver Monitoring.
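The agent itself is installed with a script from Google's documentation. As a quick spot check from inside the notebook, you could also poll utilization directly with the pynvml bindings; this is a sketch of that alternative, not part of the Stackdriver agent:

```python
# Quick check of GPU utilisation using the NVIDIA Management Library
# bindings (pip install pynvml). If the training job never touches the
# GPU, the reported utilisation stays near 0%.
from pynvml import (
    nvmlInit, nvmlShutdown, nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex, nvmlDeviceGetName,
    nvmlDeviceGetUtilizationRates,
)

nvmlInit()
try:
    for i in range(nvmlDeviceGetCount()):
        handle = nvmlDeviceGetHandleByIndex(i)
        util = nvmlDeviceGetUtilizationRates(handle)
        print(f"GPU {i} ({nvmlDeviceGetName(handle)}): "
              f"{util.gpu}% busy, {util.memory}% memory bandwidth")
finally:
    nvmlShutdown()
```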
Additionally, there are two ways to make use of the GPUs:
High-level Estimator API: No code changes are necessary as long as your ClusterSpec is configured properly. If a cluster is a mixture of CPUs and GPUs, map the ps job name to the CPUs and the worker job name to the GPUs.
Core TensorFlow API: You must assign ops to run on GPU-enabled machines. This process is the same as using GPUs with TensorFlow locally. You can use tf.train.replica_device_setter to assign ops to devices (see the sketch after this list).
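A minimal TensorFlow 1.x sketch of the second option, using tf.train.replica_device_setter for explicit device assignment; the cluster layout and ops below are hypothetical and only for illustration:

```python
# TensorFlow 1.x: assigning ops to devices with replica_device_setter.
import tensorflow as tf

# Hypothetical cluster: ps tasks on CPU machines, workers on GPU machines.
cluster = tf.train.ClusterSpec({
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222"],
})

# Variables are placed on the ps tasks; compute ops go to the worker's GPU.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device="/job:worker/task:0/gpu:0")):
    x = tf.placeholder(tf.float32, shape=[None, 300])
    w = tf.Variable(tf.zeros([300, 1]))
    y_pred = tf.matmul(x, w)  # this op runs on the worker's GPU
```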
Also, here is a lecture about when to use a GPU instead of a CPU, and here you can read a lecture about GPU performance for tree training.
Upvotes: 1