Reputation: 814
I have multiple algorithms I run in loops. Those that contain TensorFlow slow down dramatically after many iterations.
Each file list contains roughly 10,000 files, depending on the algorithm. I loop through the file list one file at a time, creating a data frame from each file, running my algorithm on the data frame, then writing the result to a database. It looks something like:
file_list = self.get_files()
for file in file_list:
    data = self.get_data(file.fileid)
    result = self.get_result(data)
    self.write_result(result)
get_result is a different function for each algorithm. It normally takes 0-5 seconds to calculate the result per file.
I'm working with an algorithm at the minute that at the beginning of the loop processes 2 files per second, but after a few hundred files it slows down to a minute per file. Inspecting the code it has to be TF that is the bottleneck as the rest of the code is relatively trivial.
In get_result there is the following line that I believe is the culprit:
z = self.evaluate_risk(normalized_X)
def evaluate_risk(self, X):
    with tf.device('/cpu:0'):
        with tf.Session() as sess:
            tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], model_pb)
            graph = tf.get_default_graph()
            input_x = graph.get_tensor_by_name("input:0")
            risk = graph.get_tensor_by_name("risk:0")
            z = sess.run(risk, {input_x: X})
            sess.close()
            del sess
            del graph
    return z
Given that I'm using with, I don't understand why this function is causing any issues. I have since added sess.close(), del sess, and del graph, but I still get the same issue.
Each time I have a new file and call get_result, I should be starting TensorFlow fresh. Are there any obvious reasons my loop slows down? I'm guessing some part of TensorFlow isn't resetting.
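One way to confirm where the time actually goes is to time each stage of the loop separately and watch how the numbers drift over the run. Below is a minimal, self-contained sketch of that idea; the get_data and get_result bodies are hypothetical stand-ins for the real ones:

```python
import time

def get_data(fileid):
    # Hypothetical stand-in for the real file -> data-frame step
    return list(range(1000))

def get_result(data):
    # Hypothetical stand-in for the real TF-based computation
    return sum(data)

def process(file_ids):
    """Run the loop, recording (load_seconds, compute_seconds) per file."""
    timings = []
    for fileid in file_ids:
        t0 = time.perf_counter()
        data = get_data(fileid)
        t1 = time.perf_counter()
        result = get_result(data)
        t2 = time.perf_counter()
        timings.append((t1 - t0, t2 - t1))
    return timings

timings = process(range(5))
for i, (load_s, compute_s) in enumerate(timings):
    print(f"file {i}: load={load_s:.6f}s compute={compute_s:.6f}s")
```

If the compute column grows steadily while the load column stays flat, the slowdown is inside the TF call rather than in the file handling.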
Upvotes: 1
Views: 848
Reputation: 59681
Without seeing a complete example it is hard to tell what the best solution is, but a likely cause is that each call loads the model into the default graph again, so the graph keeps growing with every file and each new session gets more expensive to create. Generally I would load the model only once (maybe in a graph of its own) and create only one session, then use that in evaluate_risk. That should significantly reduce the overhead of each call. You could do something like this:
def __init__(self):
    # ... init code
    self.graph = tf.Graph()  # Have the model live in its own graph
    with self.graph.as_default(), tf.device('/cpu:0'):
        self.session = tf.Session()
        tf.saved_model.loader.load(self.session, [tf.saved_model.tag_constants.SERVING], model_pb)
        self.input_x = self.graph.get_tensor_by_name("input:0")
        self.risk = self.graph.get_tensor_by_name("risk:0")

def __del__(self):
    # Ensure the session is closed when the object is deleted
    # (or do it in another method, or make the object work as a context manager, ...)
    self.session.close()

def evaluate_risk(self, X):
    return self.session.run(self.risk, {self.input_x: X})
EDIT: Closing the session in the __del__ method may be superfluous, as in principle when the object is deleted its session will be too, and thus closed. However, it avoids the potential issue of someone grabbing a reference to the session in the object (like obj_session = my_object.session), which could result in the session not being closed as expected. It also makes clearer when the session is expected to be closed.
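The context-manager option mentioned in the comment could be sketched like this. To keep the example self-contained and runnable, _Session below is a hypothetical stand-in for tf.Session; the real version would load the model in __init__ and keep the rest identical:

```python
class _Session:
    # Hypothetical stand-in for tf.Session, just to make the sketch runnable
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def run(self, X):
        return [x * 2 for x in X]  # dummy "risk" computation

class ModelRunner:
    """Owns one long-lived session and works as a context manager."""

    def __init__(self):
        # Real code: build the graph and load the saved model here, once
        self.session = _Session()

    def evaluate_risk(self, X):
        # Reuses the single session instead of creating one per call
        return self.session.run(X)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.session.close()  # closed exactly once, even if an error occurred
        return False  # do not swallow exceptions

with ModelRunner() as runner:
    out = runner.evaluate_risk([1, 2, 3])
```

This makes the session's lifetime explicit at the call site: it spans the whole file loop and is guaranteed to be closed when the with block exits.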
Upvotes: 1