Why do the following two pieces of code take different times to execute (what is actually going on inside NumPy arrays)?

Question

This is the version implemented with explicit loops:

import cv2 as c
import numpy as np
img=c.imread("E:\one.jpg")
row,column,_=img.shape
arr=np.zeros((row,column), dtype=np.int8)
for i in range(0,row-1):
    for j in range(0,column-1):
        arr[i,j]=.114*img[i,j,0]+.587*img[i,j,1]+.299*img[i,j,2]
c.imshow("one",arr)
c.waitKey(0)
c.destroyAllWindows()

This one uses full-array operations:

import cv2 as c
import numpy as np
img=c.imread("E:\one.jpg")
row,column,_=img.shape
arr=np.zeros((row,column), dtype=np.int8)
img[:,:,0]=img[:,:,0]*.114
img[:,:,1]=img[:,:,1]*.587
img[:,:,2]=img[:,:,2]*.299
arr=img[:,:,0]+img[:,:,1]+img[:,:,2]
c.imshow("one",arr)
c.waitKey(0)
c.destroyAllWindows()

Why does the second one run faster than the first?

It would be very helpful to understand the logic behind the full-array operations.

hpaulj · Accepted Answer

Both versions do roughly the same calculations, but in the second one most of the iteration takes place in compiled code rather than in the Python interpreter.

Try this:
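To make the difference concrete, here is a rough timing sketch. The random test image, its size, and the helper names `gray_loop` and `gray_vectorized` are my own stand-ins, not anything from the question, and the actual timings will depend on your machine.

```python
import timeit
import numpy as np

# Synthetic stand-in for an image loaded with cv2.imread (assumed shape and dtype).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
w = np.array([0.114, 0.587, 0.299])  # weights from the question (B, G, R order)

def gray_loop(img):
    """Explicit Python loops: one interpreter round trip per pixel."""
    rows, cols, _ = img.shape
    out = np.zeros((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            out[i, j] = (0.114 * img[i, j, 0]
                         + 0.587 * img[i, j, 1]
                         + 0.299 * img[i, j, 2])
    return out

def gray_vectorized(img):
    """Single call; the per-pixel loop runs inside NumPy's compiled code."""
    return np.dot(img, w)

loop_result = gray_loop(img)
vec_result = gray_vectorized(img)

t_loop = timeit.timeit(lambda: gray_loop(img), number=3)
t_vec = timeit.timeit(lambda: gray_vectorized(img), number=3)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```

Both functions produce the same values; only where the looping happens differs.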

 w = np.array([.114,.587, .299])
 arr = np.sum(img*w[None,None,:], axis=2)

(I think I got the dimensions right; I'll test it shortly.)

Or even

 arr = np.dot(img,w)

Testing:

In [259]: img=np.ones((5,5,3),int)
In [260]: img[:,1:-1,1]=2; img[1:-1,:,2]=3
In [261]: w = np.array([.114,.587, .299])
In [262]: arr = np.sum(img*w[None,None,:], axis=2)
In [263]: arr
Out[263]: 
array([[ 1.   ,  1.587,  1.587,  1.587,  1.   ],
       [ 1.598,  2.185,  2.185,  2.185,  1.598],
       [ 1.598,  2.185,  2.185,  2.185,  1.598],
       [ 1.598,  2.185,  2.185,  2.185,  1.598],
       [ 1.   ,  1.587,  1.587,  1.587,  1.   ]])
In [264]: np.dot(img,w)
Out[264]: 
array([[ 1.   ,  1.587,  1.587,  1.587,  1.   ],
       [ 1.598,  2.185,  2.185,  2.185,  1.598],
       [ 1.598,  2.185,  2.185,  2.185,  1.598],
       [ 1.598,  2.185,  2.185,  2.185,  1.598],
       [ 1.   ,  1.587,  1.587,  1.587,  1.   ]])
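If the dimensions are the confusing part, a small shape check may help. This is only a sketch on a dummy array; the names `w3`, `weighted`, `by_sum` and `by_dot` are mine.

```python
import numpy as np

img = np.ones((5, 5, 3), dtype=int)      # dummy (rows, cols, channels) array
w = np.array([0.114, 0.587, 0.299])

w3 = w[None, None, :]                    # shape (1, 1, 3)
weighted = img * w3                      # broadcasts to (5, 5, 3)
by_sum = weighted.sum(axis=2)            # collapse the channel axis -> (5, 5)
by_dot = np.dot(img, w)                  # sum-product over the last axis -> (5, 5)

print(w3.shape, weighted.shape, by_sum.shape, by_dot.shape)
```

Broadcasting aligns trailing dimensions, so `img * w` without the `None` indexing should give the same result here; the explicit form just makes the intent visible.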

Your second version, adjusted:

# arr=np.zeros((row,column), dtype=np.int8)  # not needed; arr is reassigned below
img1 = np.zeros(img.shape, dtype=float)  # to allow float multiplication
img1[:,:,0]=img[:,:,0]*.114  # a fast i,j iteration
img1[:,:,1]=img[:,:,1]*.587  # another fast i,j
img1[:,:,2]=img[:,:,2]*.299   # yet another
arr=img1[:,:,0]+img1[:,:,1]+img1[:,:,2]  # and yet another :)

This iterates over the full i,j range of the first two dimensions several times, but it does so in compiled code, so it is much faster than your explicit iteration in Python.
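The reason for the separate float array is that writing the products back into an integer image throws away the fractional part. A minimal sketch, assuming the image is `uint8` (which is what `cv2.imread` normally returns) and using a constant dummy image:

```python
import numpy as np

img = np.full((2, 2, 3), 200, dtype=np.uint8)   # dummy image, every channel = 200
w = np.array([0.114, 0.587, 0.299])

# In-place version, as in the question: each float product is cast back to uint8.
truncated = img.copy()
for k in range(3):
    truncated[:, :, k] = truncated[:, :, k] * w[k]
gray_trunc = truncated[:, :, 0] + truncated[:, :, 1] + truncated[:, :, 2]

# Float version: nothing is cast back to an integer type.
gray_float = np.dot(img, w)

print(gray_trunc.dtype, gray_trunc[0, 0])
print(gray_float.dtype, gray_float[0, 0])
```

Each of the three products loses its fractional part before the sum, so the integer result comes out a little lower than the float one.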

Your first example is off by one in its iteration; range(0,row-1) and range(0,column-1) stop before the last row and column, so those stay at zero:

In [244]: arr=np.zeros((row,column), dtype=float)
In [245]: for i in range(0,row-1):
   .....:     for j in range(0,column-1):
   .....:         arr[i,j]=.114*img[i,j,0]+.587*img[i,j,1]+.299*img[i,j,2]
   .....:
In [246]: arr
Out[246]: 
array([[ 1.185,  1.185,  1.185,  1.185,  0.   ],
       [ 1.185,  1.185,  1.185,  1.185,  0.   ],
       [ 1.185,  1.185,  1.185,  1.185,  0.   ],
       [ 1.185,  1.185,  1.185,  1.185,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ]])
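The fix is to let `range` run over the full size, since its upper bound is already exclusive. A tiny sketch with made-up dimensions:

```python
import numpy as np

rows, cols = 3, 4   # made-up sizes, just for illustration

# As in the question: range(0, n-1) yields 0 .. n-2, skipping the last index.
short = np.zeros((rows, cols))
for i in range(0, rows - 1):
    for j in range(0, cols - 1):
        short[i, j] = 1.0

# range(n) yields 0 .. n-1, which covers every index.
full = np.zeros((rows, cols))
for i in range(rows):
    for j in range(cols):
        full[i, j] = 1.0

print(short)
print(full)
```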
