Reputation: 1072
I am using Jupyter on GCP (set up the easy way via the AI Platform) to train a MondrianForestRegressor from scikit-garden. My dataset is about 450000 x 300, and training on the machine as-is (32 CPUs, 208 GB RAM) is far slower than I would like, even with parallelism enabled via n_jobs=-1.
I attached GPUs (2x NVIDIA Tesla T4), restarted the instance and tried again. Training speed seems unaffected by this change.
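For reference, a minimal sketch of the kind of setup described (the question does not show the actual notebook code, so the data and hyperparameters below are stand-ins):

```python
# Hypothetical reconstruction of the training setup described above.
import numpy as np
from skgarden import MondrianForestRegressor

# Stand-in for the ~450000 x 300 dataset mentioned in the question.
X = np.random.rand(450_000, 300)
y = np.random.rand(450_000)

# n_jobs=-1 parallelises tree fitting across all available CPUs.
# scikit-garden (like scikit-learn) runs on the CPU, so this fit
# does not touch the attached GPUs.
mfr = MondrianForestRegressor(n_estimators=10, n_jobs=-1)
mfr.fit(X, y)
```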
Upvotes: 0
Views: 580
Reputation: 469
When you create a Notebook, it allocates a GCE VM instance and a GPU. To monitor the GPU, you should install the GPU metrics reporting agent on each VM instance that has a GPU attached; it collects GPU data and sends it to Stackdriver Monitoring.
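The agent itself is installed with a script from Google's documentation. As a quick spot check from inside the notebook, you could also poll utilization directly with the pynvml bindings; this is a sketch of that alternative, not part of the Stackdriver agent:

```python
# Quick check of GPU utilisation using the NVIDIA Management Library
# bindings (pip install pynvml). If the training job never touches the
# GPU, the reported utilisation stays near 0%.
from pynvml import (
    nvmlInit, nvmlShutdown, nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex, nvmlDeviceGetName,
    nvmlDeviceGetUtilizationRates,
)

nvmlInit()
try:
    for i in range(nvmlDeviceGetCount()):
        handle = nvmlDeviceGetHandleByIndex(i)
        util = nvmlDeviceGetUtilizationRates(handle)
        print(f"GPU {i} ({nvmlDeviceGetName(handle)}): "
              f"{util.gpu}% busy, {util.memory}% memory bandwidth")
finally:
    nvmlShutdown()
```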
Additionally, there are two ways to make use of the GPUs:
High-level Estimator API: No code changes are necessary as long as your ClusterSpec is configured properly. If a cluster is a mixture of CPUs and GPUs, map the ps job name to the CPUs and the worker job name to the GPUs.
Core TensorFlow API: You must assign ops to run on GPU-enabled machines. This process is the same as using GPUs with TensorFlow locally. You can use tf.train.replica_device_setter to assign ops to devices (see the sketch after this list).
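A minimal TensorFlow 1.x sketch of the second option, using tf.train.replica_device_setter for explicit device assignment; the cluster layout and ops below are hypothetical and only for illustration:

```python
# TensorFlow 1.x: assigning ops to devices with replica_device_setter.
import tensorflow as tf

# Hypothetical cluster: ps tasks on CPU machines, workers on GPU machines.
cluster = tf.train.ClusterSpec({
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222"],
})

# Variables are placed on the ps tasks; compute ops go to the worker's GPU.
with tf.device(tf.train.replica_device_setter(
        cluster=cluster, worker_device="/job:worker/task:0/gpu:0")):
    x = tf.placeholder(tf.float32, shape=[None, 300])
    w = tf.Variable(tf.zeros([300, 1]))
    y_pred = tf.matmul(x, w)  # this op runs on the worker's GPU
```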
Also, here is a lecture about when to use a GPU instead of a CPU, and here you can read a lecture about GPU performance for tree training.
Upvotes: 1