bhola prasad
bhola prasad

Reputation: 725

How to permanently install Rapids on Google colab?

Is there a way to install Rapids permanently on Google colab? I tried many solutions given on StackOverflow and other websites but nothing is working. This is a very big library and it is very frustrating to download this every time I want to work on colab.

I tried this code from Rapids but it is also not working. When I close colab and start again later, I get ModuleNotFoundError: No module named 'cudf'.

# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh stable

import sys, os, shutil

sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ["CONDA_PREFIX"] = "/usr/local"
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
  fn = 'lib'+so+'.so'
  source_fn = '/usr/local/lib/'+fn
  dest_fn = '/usr/lib/'+fn
  if os.path.exists(source_fn):
    print(f'Copying {source_fn} to {dest_fn}')
    shutil.copyfile(source_fn, dest_fn)
# fix for BlazingSQL import issue
# ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/../../libblazingsql-engine.so)
if not os.path.exists('/usr/lib64'):
    os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
  if 'libstdc' in so_file:
    shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib64/'+so_file)
    shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib/x86_64-linux-gnu/'+so_file)
  

A solution has been suggested but which uses pip to install libraries - How do I install a library permanently in Colab? but Rapids can't be installed using pip. It can only be installed using Conda. This is the code to install it.

conda create -n rapids-0.19 -c rapidsai -c nvidia -c conda-forge \
    rapids-blazing=0.19 python=3.7 cudatoolkit=11.0

I tried to include the google drive path(nb_path) to this code using the --prefix flag as suggested by the above link !pip install --target=$nb_path jdc but I am getting a syntax error.

Can anyone tell me how to set this nb_path to the conda create code above?

Upvotes: 2

Views: 1550

Answers (1)

TaureanDyerNV
TaureanDyerNV

Reputation: 1291

5/14/24 update:

cuDF and cudf.pandas now comes preinstalled on Google Colab's GPU instances. You can still pip install the rest of the RAPIDS libraries onto your GPU enabled Colab instance, like cuML, cuGraph, cuXfilter, and cuSpatial, using the RAPIDS+Colab Pip Installation Template

Previously:

For reference, the conda target path for RAPIDS install is /usr/local. We use a different location in the RAPIDS-Colab install script to get it to work.

At the moment, I'm not aware of any way for a user to permanently install RAPIDS into Google Colab. Google Colab isn't designed for the purpose of persisting libraries - or any data for that matter- that aren't preinstalled in the environment. While you have a decent looking workaround there for pip libraries and datasets with Google Drive mounting, with RAPIDS, it is a little more tricky as we update quite a bit of the Colab environment in order to get it to even install RAPIDS. What you propose is an interesting path to explore. We do encourage and work with RAPIDS community members in our Slack channel who try new methods and improve some of our community code like the RAPIDS-Colab installation script.

Just remember, the RAPIDS + Google Colab effort was never meant to be more than a fun, easy way to "Try RAPIDS out". For Google Cloud users, GCP is supposed to be the next step. While it's heartening to see the usage grow over time, Google would need to create a Colab instance that has RAPIDS preinstalled for what you want to happen. You should let the know you want this by

  1. Open any Colab notebook
  2. Go to the Help menu and select ”Send feedback...”

In the meantime, if you need a ready-to-go instance, there are some inexpensive, RAPIDS-enabled, quick start options on the horizon.

Upvotes: 4

Related Questions