Gustavo Kaneto

Reputation: 683

OpenCV and Python: arithmetic operations between scalar and all pixels and channels

I'm trying to perform arithmetic operations (addition, multiplication, etc.) on color images (3 channels) in Python, using, for example, cv.add(img, value), where img is a 3-channel image and value is a scalar.

But the functions are only changing the first channel. I found that in C++ you must use Scalar(value1, value2, value3) to apply operations to all channels.

How can I do that in Python? Is there a way to pass those 3 scalar values to the function at the same time, so that I don't need to use loops?

Edit: also, I think it's preferable to use OpenCV functions, because they have the advantage of being "saturated operations". When dealing with uint8, for example, adding 250 and 10 with cv.add returns 255, not 260. With numpy, the result wraps around: 250 + 10 = 260 % 256 = 4.
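To make the difference concrete, here is a minimal NumPy-only sketch of wrap-around versus saturation. The array values and variable names are made up for illustration, and the clip-based version is just one way to emulate what cv.add does, not how OpenCV implements it.

```python
import numpy as np

# Illustrative values only: two uint8 arrays, so the sum stays uint8.
a = np.array([250, 100, 0], dtype=np.uint8)
b = np.array([10, 10, 10], dtype=np.uint8)

# Plain uint8 addition wraps modulo 256 (250 + 10 -> 4).
wrapped = a + b

# Emulated saturation: add in a wider type, clip to [0, 255], cast back.
saturated = np.clip(a.astype(np.int16) + b.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped)    # first element wraps to 4
print(saturated)  # first element saturates at 255
```

Both operands are uint8 arrays here on purpose, because how NumPy handles a Python integer that does not fit in uint8 depends on the NumPy version.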

Example code and error

I created a 4x4 pixels image, 3 channels, and then tried to add a scalar.

import cv2 as cv
import numpy as np

img = np.zeros((4,4,3), np.uint8)
print(img)
cv.add(img, 2)

And the results are:

array([[[2, 0, 0],
        [2, 0, 0],
        [2, 0, 0],
        [2, 0, 0]],

       [[2, 0, 0],
        [2, 0, 0],
        [2, 0, 0],
        [2, 0, 0]],

       [[2, 0, 0],
        [2, 0, 0],
        [2, 0, 0],
        [2, 0, 0]],

       [[2, 0, 0],
        [2, 0, 0],
        [2, 0, 0],
        [2, 0, 0]]], dtype=uint8)

But if I use just one pixel, or one row, or one column, the results are right:

a = img[1,1]; print(a)
cv.add(a, 2)

a = img[:,1]; print(a)
cv.add(a, 2)

a = img[1,:,]; print(a)
cv.add(a, 2)

Results for the last of three examples above:

In [341]: a = img[1,:,]; print(a)
[[0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]]

In [342]: cv.add(a, 2)
Out[342]: 
array([[2, 2, 2],
       [2, 2, 2],
       [2, 2, 2],
       [2, 2, 2]], dtype=uint8)

Why use opencv functions, instead of numpy?

First, I think it's just strange that you can't do this directly with opencv functions. :P (Also, the function works for 1 column or 1 row of pixels; that makes me think that there is a simple solution to make it work in opencv-python.)

Second, performance seems to be very different. Absolute time is not that large, but if you, for example, need to do some heavy processing in a live video, that might make a difference.

I ran a simple comparison of cv.add() and numpy, adding a scalar to a one-channel image:

img = np.zeros((500,500,1), np.uint8)

# sum without saturation
%timeit res1 = img + 100 
%timeit res2 = cv.add(img, 100)

#sum with saturation (going over 255)
%timeit res1 = img + 300
%timeit res2 = cv.add(img, 300)

And the performance results are:

In [56]: %timeit res1 = img + 100
688 µs ± 19.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [57]: %timeit res2 = cv.add(img, 100)
129 µs ± 9.96 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [58]: %timeit res1 = img + 300
1.41 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [59]: %timeit res2 = cv.add(img, 300)
736 µs ± 9.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

On operations without saturation, numpy addition is about 5 times slower than opencv. With saturation, it's about 2 times slower. But you still have to correct the numpy results, so that it shows a saturated 255; and eventually you'll have to convert it back to uint8 (numpy converted the result to uint16 to accommodate the results):

%timeit res1 = img + 300; res1[res1 > 255] = 255
2.89 ms ± 67.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit res1 = img + 300; res1[res1 > 255] = 255; res1 = np.uint8(res1)
3.79 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So, the complete operation is again about 5 times slower when using numpy...

Upvotes: 3

Views: 4130

Answers (1)

Dan Mašek

Reputation: 19041

In the Python OpenCV bindings, if you want to pass something to an OpenCV function that should be interpreted as a scalar, you should use a tuple with 4 elements. The size is important: that's what allows the wrapper code to recognise it as such. This corresponds to the C++ type cv::Scalar, which also holds 4 values. Only the values that are needed are used (one per channel of the other operand); the rest are ignored.

Example:

import cv2
import numpy as np
img = np.ones((4,4,3), np.uint8)
print(cv2.add(img, (1,2,255,0)))

Console output:

[[[  2   3 255]
  [  2   3 255]
  [  2   3 255]
  [  2   3 255]]

 [[  2   3 255]
  [  2   3 255]
  [  2   3 255]
  [  2   3 255]]

 [[  2   3 255]
  [  2   3 255]
  [  2   3 255]
  [  2   3 255]]

 [[  2   3 255]
  [  2   3 255]
  [  2   3 255]
  [  2   3 255]]]

Upvotes: 2
