Anatoly Vasilyev

Reputation: 411

Theano/Pylearn2. How to parallelize training?

I have a convolutional neural network model described in YAML. When I run pylearn2's train.py, I see that only one of my four cores is used.

Is there a way to run training multi-threaded?

This may be more of a Theano question. I followed the Theano tutorial on multi-core support (http://deeplearning.net/software/theano/tutorial/multi_cores.html), but OMP_NUM_THREADS=2 python theano/misc/check_blas.py -q doesn't work for me: I see only one thread running. A further question: can training be parallelized with OMP_NUM_THREADS at all? I can't check this myself, since OMP_NUM_THREADS has no effect for me. Should I be looking at my BLAS setup instead?

I have BLAS with LAPACK installed and numpy linked against them. I'm running Python 2.7.9 on Ubuntu 15.04 with a Core i5 4300U.

Thank you, warm wishes!

Upvotes: 2

Views: 1794

Answers (1)

Daniel Renshaw

Reputation: 34177

The most direct answer to your question is: you can't parallelize training in the way you desire.

BLAS, OpenMP, and/or running on a GPU only allow certain operations to be parallelized. The training itself can only be parallelized, in the way you want, if the training algorithm is designed to be parallelized. By default, PyLearn2 uses the ordinary stochastic gradient descent (SGD) training algorithm, which is not parallelizable. There are versions of SGD that support parallelization (e.g. Google's DistBelief), but these are not available in PyLearn2 off the shelf. This is mostly because PyLearn2 is built on top of Theano, and Theano is very much designed for shared-memory operations.
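To illustrate why ordinary SGD is sequential, here is a minimal pure-Python sketch. It is not PyLearn2 or Theano code; the function name sgd_fit and the toy one-parameter model y ~ w * x are made up for this example. Each update reads the weight produced by the previous update, so the steps themselves cannot run side by side; only the arithmetic inside a single step (in a real network, the large matrix products) is something BLAS or OpenMP can spread over cores.

```python
def sgd_fit(samples, learning_rate=0.1, epochs=50):
    """Fit y ~ w * x with plain SGD, one sample per update (toy example)."""
    w = 0.0
    for _ in range(epochs):
        for x, y in samples:
            # The gradient is evaluated at the current w, which is the
            # result of the previous update, so update t+1 cannot start
            # before update t has finished.
            grad = 2.0 * (w * x - y) * x
            w -= learning_rate * grad
    return w


# Toy data generated from y = 2 * x.
samples = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)]
w = sgd_fit(samples)
```

On this toy data w should end up very close to 2.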

If you have a GPU then you'll almost certainly get faster training by switching to it. If that isn't an option, you should still see more than one core being used some of the time, as long as your BLAS and OpenMP are set up correctly. The fact that check_blas.py doesn't show any improvement with OMP_NUM_THREADS=2 suggests you don't have them set up correctly. I suggest opening a new question if you need help with this, providing more information about what you've done, and the settings shown by numpy when you print its config (see here for example).
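If it helps with gathering that information, one possible way to compare runs is a small wrapper like the sketch below. The helper names run_with_threads and python_snippet are hypothetical, made up for this example. It starts each command in a child process because OMP_NUM_THREADS is normally read when the OpenMP/BLAS runtime starts up, so setting it after numpy has been imported may have no effect.

```python
import os
import subprocess
import sys


def run_with_threads(command, num_threads):
    """Run `command` in a child process with OMP_NUM_THREADS set.

    The variable goes into the child's environment, so it is already in
    place when that process loads its BLAS/OpenMP libraries.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(num_threads)
    output = subprocess.check_output(command, env=env)
    return output.decode("utf-8", "replace")


def python_snippet(source):
    """Build a command that runs `source` with the current interpreter."""
    return [sys.executable, "-c", source]
```

For example, run_with_threads(python_snippet("import numpy; numpy.show_config()"), 2) returns the numpy build settings as text, and calling run_with_threads([sys.executable, "theano/misc/check_blas.py", "-q"], n) for n = 1 and n = 2 lets you compare the reported timings, assuming you run it from the directory where that script path is valid.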

Upvotes: 2
