kauDaOtha

Reputation: 1190

How to use OpenCV functions in Keras Lambda Layer?

I am trying to use a function that uses some OpenCV function on the image. But the data I am getting is a tensor and I am not able to convert it into an image.

def image_func(img):
     img=cv2.cvtColor(img,cv2.COLOR_BGR2YUV) 
     img=cv2.resize(img,(200,66))
     return img

model=Sequential()
model.add(Lambda(image_func,input_shape=(r,c,ch),output_shape=(r,c,ch)))

When I run this snippet it throws an error in the cvtColor function saying that img is not a numpy array. I printed out img and it seemed to be a tensor.

I do not know how to change the tensor to an image and then return the tensor as well. I want the model to have this layer.

If I cannot achieve this with a lambda layer what else can I do?

Upvotes: 9

Views: 3268

Answers (2)

pitfall

Reputation: 2621

You are confusing the symbolic operations that a Lambda layer performs with the numerical operations of an ordinary Python function.

Your custom operation accepts numerical inputs (NumPy arrays) but not symbolic ones (tensors). To fix this, you need something like py_func in TensorFlow, which wraps a Python function so it can be used as a graph operation.

In addition, you have not considered backpropagation. Although this layer has no trainable parameters, you still need to take care of its gradient.

import numpy as np
import tensorflow as tf
from keras.layers import Input, Conv2D, Lambda, Layer
from keras.models import Model
from keras import backend as K
import cv2

def image_func(img):
    img=cv2.cvtColor(img,cv2.COLOR_BGR2YUV) 
    img=cv2.resize(img,(200,66))
    return img.astype('float32')

def image_tensor_func(img4d) :
    results = []
    for img3d in img4d :
        rimg3d = image_func(img3d )
        results.append( np.expand_dims( rimg3d, axis=0 ) )
    return np.concatenate( results, axis = 0 )

class CustomLayer( Layer ) :
    def call( self, xin )  :
        xout = tf.py_func( image_tensor_func, 
                           [xin],
                           'float32',
                           stateful=False,
                           name='cvOpt')
        xout = K.stop_gradient( xout ) # explicitly set no grad
        xout.set_shape( [xin.shape[0], 66, 200, xin.shape[-1]] ) # explicitly set output shape
        return xout
    def compute_output_shape( self, sin ) :
        return ( sin[0], 66, 200, sin[-1] )

x = Input(shape=(None,None,3))
f = CustomLayer(name='custom')(x)
y = Conv2D(1,(1,1), padding='same')(f) # feed the custom layer's output, not the raw input

model = Model( inputs=x, outputs=y )
model.summary()

Now you can test this layer with some dummy data.

a = np.random.randn(2,100,200,3)
b = model.predict(a)
print(b.shape)

model.compile('sgd',loss='mse')
model.fit(a,b)

Upvotes: 5

parsethis

Reputation: 8078

I'm going to assume that image_func does what you want (resize) to an image. Note that an image is represented by a NumPy array. Since you are using the TensorFlow backend, you are operating on Tensors (this you knew).

The job now is to convert a Tensor to a NumPy array. To do that, we need to evaluate the Tensor, and in order to do that we need to grab a TensorFlow session.

Use the get_session() method of the keras backend module to grab the current tensorflow session.

Here is the docstring for get_session()

def get_session():
    """Returns the TF session to be used by the backend.
    If a default TensorFlow session is available, we will return it.
    Else, we will return the global Keras session.
    If no global Keras session exists at this point:
    we will create a new global session.
    Note that you can manually set the global session
    via `K.set_session(sess)`.
    # Returns
        A TensorFlow session.
    """

So try:

def image_func(img):

    from keras import backend as K

    sess  = K.get_session()
    img = sess.run(img) # now img is a proper numpy array 

    img=cv2.cvtColor(img,cv2.COLOR_BGR2YUV) 
    img=cv2.resize(img,(200,66))
    return img

Note, I haven't been able to test this

EDIT: Just tested this and it won't work (as you noticed). The Lambda function needs to return a Tensor. Computation flows through Tensors, so the operation also needs to be smooth in the sense of differentiation.

I see that the lambda is essentially changing the color space and resizing the image. Why don't you do this in a preprocessing step?
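
A minimal sketch of that preprocessing idea, applied to a whole batch before it reaches the model. The helper names preprocess_batch and nearest_resize are hypothetical, as are the dummy frame sizes. The per-image transform is passed in as an argument, so the cv2-based image_func from the question could be used in its place; the NumPy nearest-neighbour resize here is only a stand-in that keeps the example self-contained, and it does not reproduce cv2.resize or the color conversion.

```python
import numpy as np

def nearest_resize(img, width=200, height=66):
    """NumPy stand-in for the resize step (nearest neighbour, not cv2.resize)."""
    rows = np.arange(height) * img.shape[0] // height  # source row for each output row
    cols = np.arange(width) * img.shape[1] // width    # source column for each output column
    return img[rows][:, cols]

def preprocess_batch(images, per_image_fn):
    """Apply per_image_fn to every image and stack the results into one float32 array."""
    return np.stack([per_image_fn(img) for img in images]).astype('float32')

# Dummy batch of 4 frames, 160x320 with 3 channels (sizes chosen only for illustration).
raw = np.random.randint(0, 256, size=(4, 160, 320, 3), dtype=np.uint8)
ready = preprocess_batch(raw, nearest_resize)
print(ready.shape)  # (4, 66, 200, 3)
```

The resulting array can then be passed to model.fit or model.predict of a model whose input shape is (66, 200, 3), so no OpenCV call has to run inside the graph.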

Upvotes: -1
