Yao Yingjie

Reputation: 468

How to configure Python in a GPU cluster?

I have a GPU cluster with one storage node and several compute nodes, each with 8 GPUs. I am currently configuring the cluster.

One of the tasks is to set up Python. We need several versions of Python and a number of Python packages, and for some packages we may need several versions, such as different versions of TensorFlow.

So the question is: how should I set up Python and its packages so that it is convenient to switch to whichever version of a package I want to use?

I have installed both Python 2.7 and Python 3.6 on every compute node and on the storage node, but I don't think this is a good approach when there is a large number of compute nodes to configure. One solution is to install Python in the cluster's shared directory instead of the default /usr/local path. Does anyone have a better way to do this?

I am currently using OpenPBS (Torque), and I am new to HPC.

Thanks a lot.

Upvotes: 0

Views: 592

Answers (2)

cparisot

Reputation: 11

You can install the Modules software environment in a shared directory that is accessible from every node. It then becomes easy to load a specific version of Python or TensorFlow:

module load lang/Python/3.6.0
module load lib/Tensorflow/1.1.0

If you need several versions of the same package, have a look at Python's virtualenv, which lets you keep several versions of a package side by side in separate environments. To share an environment across all nodes, create the virtualenv on a shared mount point.
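As a rough illustration of the shared-mount idea, here is a minimal sketch that creates one environment per package version under a common root. It uses the standard library's venv module rather than the third-party virtualenv tool, and the function name create_shared_env, the environment name tf-1.1, and the temporary directory standing in for the shared mount are all assumptions made for the example, not part of any real cluster setup.

```python
import os
import tempfile
import venv


def create_shared_env(shared_root, name):
    """Create a virtual environment called `name` under `shared_root`.

    `shared_root` is assumed to be a directory that every node mounts
    at the same path. Returns the path of the new environment.
    """
    env_dir = os.path.join(shared_root, name)
    # with_pip=False keeps this sketch offline; pass with_pip=True if you
    # want pip bootstrapped into the environment.
    # symlinks=True on POSIX mirrors the default of `python -m venv`.
    builder = venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt"))
    builder.create(env_dir)
    return env_dir


if __name__ == "__main__":
    # A temporary directory stands in for the shared mount point here
    # (hypothetical; on a cluster this would be the NFS-style export).
    shared_root = tempfile.mkdtemp(prefix="shared-envs-")
    print(create_shared_env(shared_root, "tf-1.1"))
```

One thing to keep in mind: an environment created this way still points back at the interpreter that created it, so that interpreter needs to exist at the same path on every node. Installing Python itself on the shared mount takes care of that.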

Upvotes: 1

Fex

Reputation: 332

You could install each piece of software on the storage node under a dedicated directory and mount that directory on the compute nodes. That way you don't have to install each piece of software several times.

A common solution to this problem is Environment Modules. You install your software as a module, which means the software is installed in its own directory (e.g. /opt/modules/python/3.6/) together with a modulefile. When you run module load python/3.6, the modulefile sets environment variables so that Python 3.6 is found via PATH, PYTHONPATH, etc.
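To make the effect of a modulefile more concrete, here is a small Python sketch of the kind of environment change that module load performs. It is only an illustration of the idea, not how Environment Modules is implemented (real modulefiles are written in Tcl). The helper names prepend_path and load_python_module are hypothetical, and the choice of LD_LIBRARY_PATH as a second variable is an assumption.

```python
import os


def prepend_path(env, var, entry):
    """Return a copy of `env` with `entry` placed at the front of `var`.

    Any existing occurrence of `entry` is dropped first, so loading the
    same module twice does not duplicate it.
    """
    new_env = dict(env)
    current = new_env.get(var, "")
    parts = [p for p in current.split(os.pathsep) if p and p != entry]
    new_env[var] = os.pathsep.join([entry] + parts)
    return new_env


def load_python_module(env, prefix):
    """Mimic the effect of loading a Python module installed at `prefix`."""
    env = prepend_path(env, "PATH", os.path.join(prefix, "bin"))
    # Assumption: the installation also ships shared libraries in lib/.
    env = prepend_path(env, "LD_LIBRARY_PATH", os.path.join(prefix, "lib"))
    return env


if __name__ == "__main__":
    # Work on a copy of the current environment; nothing global changes.
    env = load_python_module(dict(os.environ), "/opt/modules/python/3.6")
    print(env["PATH"].split(os.pathsep)[0])
```

Because the install prefix goes to the front of each variable, the shell finds that Python before any system-wide one, and unloading the module only has to remove the same entries again.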

This gives you a clean separation of your software stack and also lets you install newer versions of TensorFlow without messing up the existing environment.

Upvotes: 0
