Reputation: 861
I'm using Keras with the Tensorflow backend on a cluster (creating neural networks). How can I run it in a multi-threaded way on the cluster (on several cores), or is this done automatically by Keras? For example, in Java one can create several threads, each thread running on a core.
If possible, how many cores should be used?
Upvotes: 22
Views: 33289
Reputation: 51
For Tensorflow 1.x, you can configure the Tensorflow session and use that session for the Keras backend:
import tensorflow
import keras

# Limit both intra-op and inter-op parallelism to 8 threads.
session_conf = tensorflow.ConfigProto(intra_op_parallelism_threads=8, inter_op_parallelism_threads=8)
tensorflow.set_random_seed(1)
sess = tensorflow.Session(graph=tensorflow.get_default_graph(), config=session_conf)
keras.backend.set_session(sess)
For Tensorflow 2.x, most of the modules above are deprecated or moved, so you need to reach them through the compatibility module, for example tensorflow.compat.v1.ConfigProto.
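As a minimal sketch of the same configuration under Tensorflow 2.x, assuming the tf.compat.v1 session API is acceptable (the thread counts are the same illustrative values as above):

import tensorflow as tf

# Same thread limits as above, routed through the compatibility module.
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=8, inter_op_parallelism_threads=8)
tf.compat.v1.set_random_seed(1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)

Alternatively, native Tensorflow 2.x exposes tf.config.threading.set_intra_op_parallelism_threads and tf.config.threading.set_inter_op_parallelism_threads, which must be called before any operations are created.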
Upvotes: 5
Reputation: 639
Tensorflow automatically runs the computations on as many cores as are available on a single machine.
If you have a distributed cluster, be sure to follow the instructions at https://www.tensorflow.org/how_tos/distributed/ to configure the cluster (e.g. create the tf.ClusterSpec correctly, etc.).
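As a rough illustration (the host names, ports, and job layout below are hypothetical), a cluster specification might look like this:

import tensorflow as tf

# Hypothetical worker and parameter-server addresses, for illustration only.
cluster = tf.train.ClusterSpec({
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
    "ps": ["ps0.example.com:2222"],
})
# Each process starts a server for its own job name and task index.
server = tf.train.Server(cluster, job_name="worker", task_index=0)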
To help debug, you can use the log_device_placement configuration option on the session to have Tensorflow print out where the computations are actually placed. (Note: this works for GPUs as well as for distributed Tensorflow.)
import tensorflow as tf

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
Note that while Tensorflow's computation placement algorithm works fine for small computational graphs, you might be able to get better performance on large computational graphs by manually placing the computations on specific devices (e.g. using with tf.device(...): blocks).
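A minimal sketch of manual placement (the choice of /cpu:0 here is just illustrative):

import tensorflow as tf

# Pin these ops to the first CPU device; everything else is placed automatically.
with tf.device("/cpu:0"):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
    c = tf.matmul(a, b)

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))  # the device placement log shows where each op ran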
Upvotes: 18