Reputation: 32111
Is there a way to force TensorFlow to use a single CPU core instead of distributing work across multiple cores?
I ask because it's generally true that there are diminishing returns on distributing BLAS functions across multiple CPUs, at least in cases I've experimented with using OpenBLAS and Julia.
I want to run a hyperparameter search (a few hundred model trainings) over 32 CPU cores. I expect it would be far more efficient to train 32 models in parallel, each on an individual CPU core, than to train 32 models in series using distributed BLAS (I've demonstrated this with the Mocha framework in Julia, where these kinds of changes are easy to implement).
Upvotes: 2
Views: 653
Reputation: 1637
You should be able to use the regular numactl --physcpubind, and also tf.device().
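For illustration, a minimal sketch of both approaches, assuming the TensorFlow 1.x Session API; the core id, script name, and the thread-pool settings are my own additions, not part of the answer above:

    # Pin the whole process to one physical core from the shell
    # (hypothetical core id and script name):
    #   numactl --physcpubind=0 python train_model.py

    import tensorflow as tf

    # Place the ops on the CPU device explicitly.
    with tf.device('/cpu:0'):
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.matmul(a, a)

    # Assumption beyond the answer: tf.device() only chooses the device,
    # so also limit TensorFlow's internal thread pools to one thread each
    # to keep a single training from fanning out across cores.
    config = tf.ConfigProto(intra_op_parallelism_threads=1,
                            inter_op_parallelism_threads=1)

    with tf.Session(config=config) as sess:
        print(sess.run(b))

With a setup along these lines, each of the 32 hyperparameter trials could be launched as its own process, each bound to a different core via numactl.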
Upvotes: 2