CV_passionate

Reputation: 135

TensorFlow prediction on a large number of images is slow

I am following this tutorial for image classification in TensorFlow: http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/. Training and testing on a single image work fine. However, my code for prediction on a large number of images is very slow: it uses 100% of the CPU and nearly maxes out memory as well. For 2700 images it takes more than 24 hours, which is not practical. Is there a way to do batch testing the same way we did batch training? Please note that I need to normalize the images as well. Here is my code:

import tensorflow as tf
import numpy as np
import os,glob,cv2
import sys,argparse


# First, pass the path of the image
os.chdir("/somepath")

i = 0
files = glob.glob('*.jpg')
files.extend(glob.glob('*.JPG'))
totalNumber = len(files)
print("total number of images is:", totalNumber)

image_size=128
num_channels=3
text_file = open("Results.txt", "w")

for file in files:
    images = []
    filename = file
    print(filename)
    text_file.write("\n")
    text_file.write(filename)
    # Reading the image using OpenCV
    image = cv2.imread(filename)
    # Resizing the image to our desired size and preprocessing will be done exactly as done during training
    image = cv2.resize(image, (image_size, image_size),0,0, cv2.INTER_LINEAR)
    images.append(image)
    images = np.array(images, dtype=np.uint8)
    images = images.astype('float32')
    images = np.multiply(images, 1.0/255.0) 
    #The input to the network is of shape [None image_size image_size num_channels]. Hence we reshape.
    x_batch = images.reshape(1, image_size,image_size,num_channels)

    ## Let us restore the saved model 
    sess = tf.Session()
    # Step-1: Recreate the network graph. At this step only graph is created.
    saver = tf.train.import_meta_graph('pathtomymeta/my_model-9909.meta')
    # Step-2: Now let's load the weights saved using the restore method.
    saver.restore(sess, tf.train.latest_checkpoint('pathtomycheckpoints/checkpoints/'))

    # Accessing the default graph which we have restored
    graph = tf.get_default_graph()

    # Now, let's get hold of the op that can be run to get the output.
    # In the original network y_pred is the tensor that is the prediction of the network
    y_pred = graph.get_tensor_by_name("y_pred:0")

    ## Let's feed the images to the input placeholders
    x = graph.get_tensor_by_name("x:0") 
    y_true = graph.get_tensor_by_name("y_true:0") 
    y_test_images = np.zeros((1, 3)) #np.zeros((1, 2)) 


    ### Creating the feed_dict that is required to be fed to calculate y_pred 
    feed_dict_testing = {x: x_batch, y_true: y_test_images}
    result = sess.run(y_pred, feed_dict=feed_dict_testing)
    # result is of this format [probability_of_rose probability_of_sunflower]
    print(result)
    text_file.write("\n")
    text_file.write('%s' % result[i,0])
    text_file.write("\t")
    text_file.write('%s' % result[i,1])
    text_file.write("\t")
    text_file.write('%s' % result[i,2])

text_file.close()

Upvotes: 0

Views: 2460

Answers (1)

Dr. Snoopy

Reputation: 56347

I think you should consider one very obvious optimization in your code. Inside your for loop, every iteration loads an image, loads the model, builds the graph, and then makes a prediction.

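But loading the model and building the graph do not depend on the for loop or on any variable inside it (such as the input image). It is possible that most of the time in your for loop is spent loading the model rather than doing the actual prediction. Each iteration also opens a new tf.Session that is never closed and imports another copy of the graph into the default graph, which probably explains the growing memory use. You can use a profiler to confirm where the time goes.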
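A full profiler is not strictly needed for a first check (Python's built-in cProfile module is one option). As a minimal sketch, the stages of the loop body can be timed separately with time.perf_counter. The names timed, load_model, and predict below are hypothetical stand-ins, and the sleeps only simulate work; none of this comes from the question's code or from TensorFlow.

```python
import time
from collections import defaultdict

# Accumulated wall-clock seconds per labelled stage
timings = defaultdict(float)

def timed(label, fn, *args, **kwargs):
    """Call fn, add its wall-clock time to timings[label], and return its result."""
    start = time.perf_counter()
    try:
        return fn(*args, **kwargs)
    finally:
        timings[label] += time.perf_counter() - start

# Hypothetical stand-ins for the two stages of the loop body.
def load_model():
    time.sleep(0.05)   # pretend restoring the checkpoint is expensive
    return "model"

def predict(model, image):
    time.sleep(0.001)  # pretend a single forward pass is cheap
    return [0.0, 0.0, 1.0]

for image in range(5):
    model = timed("load_model", load_model)   # repeated on every iteration
    timed("predict", predict, model, image)

# Print the stages, slowest first
for label, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print("%-12s %.3f s" % (label, seconds))
```

Wrapping the real restore and sess.run calls in the same way would show which of the two dominates.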

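So I propose that you load the model and build the graph just once, before the for loop (this includes creating the session and looking up x, y_true and y_pred). Inside the loop, keep only the image loading and preprocessing, followed by these two lines: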

feed_dict_testing = {x: x_batch, y_true: y_test_images}
result = sess.run(y_pred, feed_dict=feed_dict_testing)

It should be considerably faster. It could still be slow, but in that case the reason is that evaluating a large neural network on a CPU is slow in itself.
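To also cover the batching asked about in the question, a rough sketch could look like the following. It assumes the TF 1.x API used in the question, that the x placeholder accepts a variable batch dimension (as the question's comment about [None image_size image_size num_channels] suggests), and three output classes. The names iter_batches, predict_all, load_image, and make_tf_predictor are made up for illustration, and the batch size of 32 is arbitrary.

```python
import numpy as np

def iter_batches(items, batch_size):
    """Yield successive slices of items, each with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def load_image(filename, image_size=128):
    """Read, resize and normalize one image, as in the question's loop body."""
    import cv2  # imported lazily so the batching helpers work without OpenCV
    image = cv2.imread(filename)
    image = cv2.resize(image, (image_size, image_size),
                       interpolation=cv2.INTER_LINEAR)
    return image.astype(np.float32) / 255.0

def make_tf_predictor(meta_path, checkpoint_dir, num_classes=3):
    """Restore the model ONCE and return a function that predicts a batch."""
    import tensorflow as tf  # TF 1.x API, as in the question
    sess = tf.Session()
    saver = tf.train.import_meta_graph(meta_path)
    saver.restore(sess, tf.train.latest_checkpoint(checkpoint_dir))
    graph = tf.get_default_graph()
    x = graph.get_tensor_by_name("x:0")
    y_true = graph.get_tensor_by_name("y_true:0")
    y_pred = graph.get_tensor_by_name("y_pred:0")

    def predict_batch(x_batch):
        # The labels are fed as zeros, exactly as the question does
        dummy_labels = np.zeros((len(x_batch), num_classes))
        return sess.run(y_pred, feed_dict={x: x_batch, y_true: dummy_labels})

    return predict_batch

def predict_all(files, load_one, predict_batch, batch_size=32):
    """Return one row of predictions per file, in input order.

    Assumes files is non-empty.
    """
    rows = []
    for chunk in iter_batches(files, batch_size):
        # Stack the images into shape [batch, height, width, channels]
        x_batch = np.stack([load_one(name) for name in chunk]).astype(np.float32)
        rows.append(predict_batch(x_batch))
    return np.concatenate(rows, axis=0)

# Usage with the paths from the question (not executed here):
# predict_batch = make_tf_predictor('pathtomymeta/my_model-9909.meta',
#                                   'pathtomycheckpoints/checkpoints/')
# results = predict_all(files, load_image, predict_batch)
# for name, row in zip(files, results):
#     text_file.write("%s\t%s\n" % (name, "\t".join(str(p) for p in row)))
```

The image loader and the predictor are passed in as functions so that the batching logic can be tried with dummy stand-ins before the real model is plugged in. A smaller batch size reduces peak memory if that remains a problem.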

Upvotes: 3
