Asdf11

Reputation: 447

Multithreading in tensorflow/keras

I would like to train several different models with model.fit() in parallel within one Python application. The models don't necessarily have anything in common; they are started in the same application at different times.

First I start one model.fit() in a separate thread from the main thread, with no problems. But when I then want to start a second model.fit(), I get the following error message:

Exception in thread Thread-1:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'hidden_1/BiasAdd': Unknown input node 'hidden_1/MatMul'

Both are started from a method with the same lines of code:

def start_learn(self):
    tf_session = K.get_session()  # returns the current session, creating one if none exists yet
    tf_graph = tf.get_default_graph()

    learning_thread = keras_learn_thread.Learn(learning_data, model, self.env_cont, tf_session, tf_graph)
    learning_thread.start()

The called class/method looks like this:

def run(self):
    tf_session = self.tf_session  # taken from __init__()
    tf_graph = self.tf_graph  # taken from __init__()

    with tf_session.as_default():
        with tf_graph.as_default():
            self.learn(self.learning_data, self.model, self.env_cont)
            # this starts my learn method, where model.fit() is located

I think I somehow have to assign a new tf_session and a new tf_graph to each thread, but I am not quite sure about that. I would be glad for any short idea, since I have been stuck on this for too long now.
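The per-thread structure I have in mind could be sketched like this (a stdlib-only sketch: `FakeGraph`/`FakeSession` are placeholders standing in for `tf.Graph`/`tf.Session`, since the point is only the one-graph-and-session-per-thread layout, not TensorFlow itself):

```python
import threading

class FakeGraph:
    """Placeholder for tf.Graph."""

class FakeSession:
    """Placeholder for tf.Session(graph=...)."""
    def __init__(self, graph):
        self.graph = graph

results = []

def worker(name):
    # each thread builds its OWN graph and session instead of sharing one
    graph = FakeGraph()
    session = FakeSession(graph)
    # here the model would be built and model.fit() run under
    # `with session.as_default(), graph.as_default():`
    results.append((name, session.graph is graph))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # each thread ends up with its own graph/session pair
```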

Thanks

Upvotes: 16

Views: 5211

Answers (2)

Grzegorz Krug

Reputation: 301

Tensorflow 2.x

Assuming you are on a newer TensorFlow (the question is old), I used to define sessions like this, to prevent one application from occupying the whole GPU memory :P

import tensorflow as tf

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = False  # set True to let allocation grow as needed
config.gpu_options.per_process_gpu_memory_fraction = 0.2  # fraction of GPU memory to use
sess = tf.compat.v1.Session(config=config)

But if you want to run things in parallel, you should close each session at some point. Choose whichever style suits you more; the context manager seems more reasonable.

Normal way:

sess = tf.compat.v1.Session(config=config)
# do stuff here
sess.close()

or with context manager:

with tf.compat.v1.Session(config=config):
    pass  # do stuff here
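For the parallel case, each worker thread can open and close its own session with the context-manager style. A rough stdlib sketch of that shape (the trivial `Session` class here is a stand-in for `tf.compat.v1.Session(config=config)`; the structure is the point):

```python
import threading

class Session:
    """Stand-in for tf.compat.v1.Session(config=config)."""
    def __enter__(self):
        self.closed = False
        return self
    def __exit__(self, *exc):
        self.closed = True  # the context manager guarantees close()

sessions = []

def worker():
    with Session() as sess:
        sessions.append(sess)  # "do stuff here", e.g. run the model
    # the session is closed as soon as the block exits

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This way no thread leaks a session, even if the work inside the block raises.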

Keras

This worked with a single model; I presume it will also work correctly with multiple sessions.

Session documentation

Upvotes: 0

Julio Daniel Reyes

Reputation: 6365

I don't know if you fixed your issue, but this looks like another question I recently answered.

  • You need to finish the graph creation in the main thread before starting the others.
  • In the case of keras, the graph is initialized the first time the fit or predict function is called. You can force the graph creation by calling some of the inner functions of model:

    model._make_predict_function()
    model._make_test_function()
    model._make_train_function()
    

    If that doesn't work, try to warm up the model by calling it on dummy data.

  • Once you finish the graph creation, call finalize() on your main graph so it can be safely shared with different threads (that will make it read-only).

  • Finalizing the graph will also help you find other places where the graph is being unintentionally modified.
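The build-then-freeze idea above can be sketched with plain Python: the main thread finishes construction, makes the shared object read-only (`graph.finalize()` in TensorFlow; `MappingProxyType` stands in for it in this sketch), and worker threads may then only read it:

```python
import threading
from types import MappingProxyType

# main thread: finish "graph" construction before any worker starts
graph = {"hidden_1/MatMul": "op", "hidden_1/BiasAdd": "op"}
frozen = MappingProxyType(graph)  # read-only view, analogous to graph.finalize()

seen = []

def worker():
    # workers can read the finalized graph but cannot modify it
    seen.append("hidden_1/MatMul" in frozen)
    try:
        frozen["new_op"] = "op"
    except TypeError:
        seen.append("write rejected")

t = threading.Thread(target=worker)
t.start()
t.join()
print(seen)  # [True, 'write rejected']
```

Just as with a finalized graph, any accidental mutation from a thread fails loudly instead of silently corrupting shared state.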

Hope that helps you.

Upvotes: 1
