Reputation: 155
I want to capture frames from a video with Python and OpenCV and then classify the captured Mat images with TensorFlow. The problem is that I don't know how to convert the Mat format to a 3D tensor. This is how I am doing it now with TensorFlow (loading the image from a file):
image_data = tf.gfile.FastGFile(imagePath, 'rb').read()
with tf.Session() as sess:
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})
I would appreciate any help. Thanks in advance.
Upvotes: 13
Views: 31473
Reputation: 132
None of the other answers worked for me. Instead, I first performed whatever image operations I needed on the image. Then I encoded the image using cv.imencode():
encodedImage = cv.imencode('.jpg',image)[1]
Then I converted the encoded image to bytes using tobytes():
imageBytes = encodedImage.tobytes()
Finally, I decoded the bytes using tf.image.decode_jpeg():
tensorImage = tf.image.decode_jpeg(imageBytes, 1)
The "1" specifies that I want the image decoded as a grayscale image.
After these steps I can use "tensorImage" as an image tensor that has the OpenCV operations applied to it. So when predicting, I do the following:
prediction = model.predict(np.array([tensorImage]))
Note: the other answers may give an image with the right dimensions, so it can be fed into a model, but I found they tended to change the order of the image data, so the predictions were not useful.
Upvotes: 0
Reputation: 8180
It looks like you're using the pre-trained and pre-defined Inception model, which has a tensor named DecodeJpeg/contents:0. If so, this tensor expects a scalar string containing the bytes of a JPEG image.
You have a couple of options. One is to look further down the network for the node where the JPEG is converted to a matrix. I'm not sure what the Mat format is, but this will be a [height, width, colour_depth] representation. If you can get your image into that format, you can replace the DecodeJpeg... string with the name of the node you want to feed into.
The other option is to simply convert your images to JPEGs and feed them straight in.
Upvotes: 5
Reputation: 1405
With TensorFlow 2.0 and OpenCV 4.2.0, you can convert it this way:
import numpy as np
import tensorflow as tf
import cv2 as cv
width = 32
height = 32
# Load the image with OpenCV
img = cv.imread('img.jpg')
# Resize to match the model's input_shape
inp = cv.resize(img, (width, height))
# Convert from BGR to RGB
rgb = cv.cvtColor(inp, cv.COLOR_BGR2RGB)
# Optional but recommended: convert to a float32 tensor
rgb_tensor = tf.convert_to_tensor(rgb, dtype=tf.float32)
# Add a batch dimension to rgb_tensor
rgb_tensor = tf.expand_dims(rgb_tensor, 0)
# Now you can use rgb_tensor to predict a label, for example:
# Load a pretrained model, made from: https://www.tensorflow.org/tutorials/images/cnn
model = tf.keras.models.load_model('cifar10_model.h5')
# Create a probability model
probability_model = tf.keras.Sequential([model,
                                         tf.keras.layers.Softmax()])
# Predict the label
predictions = probability_model.predict(rgb_tensor, steps=1)
Upvotes: 10
Reputation: 1
In my case I had to read an image from a file, do some processing, and then feed it into Inception to obtain the output of a features layer, called last_layer. My solution is short but effective.
img = cv2.imread(file)
# ... do some processing
img_as_string = cv2.imencode('.jpg', img)[1].tobytes()
features = sess.run(last_layer, {'DecodeJpeg/contents:0': img_as_string})
Upvotes: 0
Reputation: 186
Load the image with OpenCV's imread, then convert it to a NumPy array.
To feed Inception v3, you need to use the Mul:0 tensor as the entry point. It expects a 4-dimensional tensor with the layout [batch, height, width, channel]. The last three come straight from a cv::Mat; the batch dimension just needs size 1, since you want to feed a single image rather than a batch. The code looks like:
# Load the file
img2 = cv2.imread(file)
# Resize to the format expected by the Mul:0 tensor
img2 = cv2.resize(img2, dsize=(299, 299), interpolation=cv2.INTER_CUBIC)
# NumPy array
np_image_data = np.asarray(img2)
# Maybe insert the float conversion here; see the edit remark
np_final = np.expand_dims(np_image_data, axis=0)
# Now feed it into the session:
# [... initialization of session and loading of graph etc.]
predictions = sess.run(softmax_tensor,
                       {'Mul:0': np_final})
# fin!
Kind regards,
Chris
Edit: I just noticed that the Inception network wants intensity values normalized as floats to [-0.5, 0.5], so please use this code to convert them before building the RGB image:
np_image_data=cv2.normalize(np_image_data.astype('float'), None, -0.5, .5, cv2.NORM_MINMAX)
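The array preparation described in this answer can be tried without a graph. The following is a minimal NumPy-only sketch under stated assumptions, not the answer's own code: `to_inception_batch` is a hypothetical helper; it reverses the channel axis because OpenCV stores pixels as BGR (whether your graph expects RGB is an assumption to verify); and it uses a fixed `pixel / 255 - 0.5` mapping instead of `cv2.normalize` with `NORM_MINMAX`, which stretches each image by its own minimum and maximum. The [-0.5, 0.5] target range is taken from the edit remark above.

```python
import numpy as np


def to_inception_batch(bgr_image):
    """Turn an H x W x 3 uint8 BGR image into a 1 x H x W x 3 float32 batch."""
    rgb = bgr_image[..., ::-1]                     # BGR -> RGB (assumed order)
    scaled = rgb.astype(np.float32) / 255.0 - 0.5  # fixed map to [-0.5, 0.5]
    return np.expand_dims(scaled, axis=0)          # add batch axis of size 1


# Stand-in for a frame already resized to 299 x 299.
frame = np.zeros((299, 299, 3), dtype=np.uint8)
frame[..., 0] = 255  # pure blue in BGR order

batch = to_inception_batch(frame)
print(batch.shape, batch.dtype)
```

With a fixed mapping, the same pixel value always produces the same input value, regardless of what else is in the frame.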
Upvotes: 15
Reputation: 2190
You should be able to convert the OpenCV Mat format to a NumPy array as follows:
np_image_data = np.asarray(image_data)
Once you have the data as a NumPy array, you can pass it to TensorFlow through a feeding mechanism, as in the link that @thesonyman101 referenced:
feed_dict = {some_tf_input:np_image_data}
predictions = sess.run(some_tf_output, feed_dict=feed_dict)
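In the Python bindings, frames returned by OpenCV are already NumPy arrays, so `np.asarray` hands back the same object and mostly acts as a safeguard. Before feeding, it can help to check the shape and dtype against what the input tensor expects. A small sketch, where `describe_frame` is a hypothetical helper and a synthetic array stands in for a captured frame:

```python
import numpy as np


def describe_frame(frame):
    """Report shape and dtype, and whether np.asarray had to copy anything."""
    arr = np.asarray(frame)
    return {
        'shape': arr.shape,          # (height, width, channels)
        'dtype': str(arr.dtype),     # typically uint8 for decoded images
        'same_object': arr is frame  # True when frame was already an ndarray
    }


# Stand-in for a 640 x 480 colour frame from cv2.VideoCapture.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
info = describe_frame(frame)
print(info)
```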
Upvotes: 1