Poehe

Reputation: 33

Tensorflow Inception batch classification slower at each iteration

I retrained Inception's final layer on my own categories using this tutorial from tensorflow.com. I am a beginner with TensorFlow and my goal is to classify 30,000 pictures for a project at work.

After retraining the final layer on my own labels, I grabbed around 20 unseen pics and added their full file paths to a pandas dataframe. Next, I feed each pic in the dataframe to the image classifier and, after classification, write the highest prediction label and its reliability score to two other columns in the same row.

To feed the pics to the classifier, I used df.iterrows(), df.apply(function) and also 3 separate hardcoded file paths (see code below; I left them commented out). However, I found that classifying the pics takes longer with each iteration, regardless of how I fed them in. Pic[0] starts with a classification time of 2.2 seconds, but by Pic[19] this has increased to 23 seconds. Imagine how long it would take at pic 10,000, 20,000, etc. Furthermore, CPU and memory usage also increase slowly while the files are classified, though the increases are not dramatic.

Please see my code below (the bulk of it, save for the pandas and classification-invocation parts, is taken from this example mentioned in the TensorFlow tutorial above).

import os
import sys
import tensorflow as tf
import pandas as pd
import gc
import numpy as np
import time
import psutil    


modelFullPath = '/Users/jaap/tf_files/retrained_graph.pb'
labelsFullPath = '/Users/jaap/tf_files/retrained_labels.txt'    

def create_graph():
    """Creates a graph from saved GraphDef file and returns a saver."""
    # Creates graph from saved graph_def.pb.
    with tf.gfile.FastGFile(modelFullPath, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(graph_def, name='')    


def run_inference_on_image(image):
    answer = None
    imagePath = image
    print imagePath
    if not tf.gfile.Exists(imagePath):
        tf.logging.fatal('File does not exist %s', imagePath)
        return answer    

    image_data = tf.gfile.FastGFile(imagePath, 'rb').read()    

    # Creates graph from saved GraphDef.
    create_graph()    

    with tf.Session() as sess:    

        softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
        predictions = sess.run(softmax_tensor,
                               {'DecodeJpeg/contents:0': image_data})
        predictions = np.squeeze(predictions)    

        top_k = predictions.argsort()[-5:][::-1]  # Getting top 5 predictions
        f = open(labelsFullPath, 'rb')
        lines = f.readlines()
        labels = [str(w).replace("\n", "") for w in lines]
        for node_id in top_k:
            human_string = labels[node_id]
            score = predictions[node_id]
            print('%s (score = %.5f)' % (human_string, score))
            return human_string, score    


werkmap = '/Users/jaap/tf_files/test/'
filelist = []
files_in_dir = os.listdir('/Users/jaap/tf_files/test/')
for f in files_in_dir:
    if f != '.DS_Store':
        filelist.append(werkmap+f)    

df = pd.DataFrame(filelist, index=None, columns=['Pics'])
df = df.drop_duplicates()
df['Class'] = ''
df['Reliability'] = ''    

print(df)    


#--------------------------------------------------------
for index, pic in df.iterrows():
    start = time.time()
    df['Class'][index] = run_inference_on_image(pic[0])
    stop = time.time()
    duration = stop - start
    print("duration = %s" % duration)
    print("cpu usage: %s" % psutil.cpu_percent())
    print("memory usage: %s " % psutil.virtual_memory())
    print("")

df['Class'] = df['Class'].astype(str)
df['Class'], df['Reliability'] = df['Class'].str.split(',', 1).str    

#-------------------------------------------------        

# df['Class'] = df['Pics'].apply(run_inference_on_image)
# df['Class'] = df['Class'].astype(str)
# df['Class'], df['Reliability'] = df['Class'].str.split(',', 1).str
# print(df)    

#--------------------------------------------------------------
# start = time.time()
# ja = run_inference_on_image('/Users/jaap/tf_files/test/12345_1.jpg')
# stop = time.time()
# duration = stop - start
# print("duration = %s" % duration)  

# start = time.time()
# ja = run_inference_on_image('/Users/jaap/tf_files/test/12345_2.jpg')
# stop = time.time()
# duration = stop - start
# print("duration = %s" % duration)    

# start = time.time()
# ja = run_inference_on_image('/Users/jaap/tf_files/test/12345_3.jpg')
# stop = time.time()
# duration = stop - start
# print("duration = %s" % duration)    

I appreciate any help!

Upvotes: 0

Views: 498

Answers (1)

user1454804

Reputation: 1080

It seems you're creating the whole graph for each inference: every call to run_inference_on_image runs create_graph(), which imports the GraphDef into the default graph again, so the graph keeps growing and each iteration gets slower. Instead, build the graph and open the session once, and reuse them for every picture:

with tf.Graph().as_default():
  create_graph()
  with tf.Session() as sess:
    for index, pic in df.iterrows():
      start = time.time()
      df['Class'][index] = run_inference_on_image(pic[0], sess)
      stop = time.time()
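
This snippet assumes run_inference_on_image is changed to take the open session and to stop calling create_graph() itself. A minimal sketch of such a modified function (untested, reusing the same tensor names 'final_result:0' and 'DecodeJpeg/contents:0' and the same labels file as in the question) could look like this:

def run_inference_on_image(image, sess):
    # The graph was already imported once outside the loop, so here we only
    # read the image bytes and run the existing session.
    if not tf.gfile.Exists(image):
        tf.logging.fatal('File does not exist %s', image)
        return None

    image_data = tf.gfile.FastGFile(image, 'rb').read()

    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})
    predictions = np.squeeze(predictions)

    with open(labelsFullPath, 'rb') as f:
        labels = [str(w).replace("\n", "") for w in f.readlines()]

    # Return only the single best prediction, as in the original code
    top_node_id = predictions.argsort()[-1]
    return labels[top_node_id], predictions[top_node_id]

With the graph imported and the session created only once, the per-image time should stay roughly constant instead of growing with every iteration.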

Upvotes: 1
