Stiefel

Reputation: 2793

Why is this numpy array operation so slow?

I am a Python beginner and I am trying to average two 2D NumPy arrays of shape (1024, 1024). Doing it like this is quite fast:

newImage = (image1 + image2) / 2

But now the images have a "mask": an element is invalid if it is set to zero. That means if either of the two input elements is zero, the resulting element should also be zero. My trivial solution is:

newImage = numpy.zeros( (1024,1024) , dtype=numpy.int16 )

for y in xrange(newImage.shape[0]):
   for x in xrange(newImage.shape[1]):
      val1 = image1[y][x]  
      val2 = image2[y][x]                            
      if val1!=0 and val2!=0:               
         newImage[y][x] = (val1 + val2) / 2

But this is really slow. I did not time it, but it seems to be slower by a factor of 100.

I also tried using a lambda operator and "map", but this does not return a NumPy array.

Upvotes: 0

Views: 1411

Answers (4)

Arijan

Reputation: 163

Element-by-element access to a NumPy array from Python does seem slow. I can't see any reason for it, but a simple example shows it clearly:

    import time
    import numpy

    # NumPy version: n element-wise writes into an int32 array of size s
    def at(s, n):
        t1 = time.time()
        a = numpy.zeros(s, dtype=numpy.int32)
        for i in range(n):
            a[i % s] = n
        t2 = time.time()
        return t2 - t1

    # Native version: the same writes into a plain Python list
    def an(s, n):
        t1 = time.time()
        a = [i for i in range(s)]
        for i in range(n):
            a[i % s] = n
        t2 = time.time()
        return t2 - t1

    # Test
    [at(100000, 1000000), an(100000, 1000000)]

Result: [0.21972250938415527, 0.15950298309326172]
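Each indexed write from Python goes through the interpreter and converts a Python object to an array element, so per-element access is not where NumPy is fast. A rough sketch of the contrast with a whole-array assignment follows; the helper names `fill_elementwise` and `fill_vectorized` are made up for illustration, and the timings will vary by machine:

```python
import timeit
import numpy as np

def fill_elementwise(a, value):
    # One Python-level indexed write per element
    for i in range(a.size):
        a[i] = value

def fill_vectorized(a, value):
    # A single slice assignment; the loop runs inside NumPy
    a[:] = value

a = np.zeros(100000, dtype=np.int32)
b = np.zeros(100000, dtype=np.int32)

t_loop = timeit.timeit(lambda: fill_elementwise(a, 7), number=3)
t_vec = timeit.timeit(lambda: fill_vectorized(b, 7), number=3)
print(t_loop, t_vec)
```

Both arrays end up with the same contents; only the time spent differs.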

Upvotes: 0

eumiro

Reputation: 212825

Try this:

newImage = numpy.where(numpy.logical_and(image1, image2), (image1 + image2) / 2, 0)

Where neither image1 nor image2 is zero, this takes their mean; everywhere else the result is zero.
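A minimal, self-contained sketch of the same idea on small arrays, assuming integer images. The sample values and the helper name `masked_mean` are made up for illustration, and `//` is used here so the mean stays an integer under Python 3 as well:

```python
import numpy as np

def masked_mean(image1, image2):
    """Element-wise integer mean; 0 wherever either input is 0."""
    both_valid = np.logical_and(image1, image2)
    return np.where(both_valid, (image1 + image2) // 2, 0)

image1 = np.array([[10, 0], [30, 40]], dtype=np.int16)
image2 = np.array([[20, 50], [0, 60]], dtype=np.int16)

result = masked_mean(image1, image2)
print(result)
```

Note that the sum is computed for every element before `numpy.where` selects from it, so with int16 inputs large pixel values could overflow; casting to a wider type first would be one way around that.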

Upvotes: 8

pberkes

Reputation: 5360

Explicit for loops are very inefficient in Python in general, not only for NumPy operations. Fortunately, there are faster ways to solve your problem. If memory is not an issue, this solution is quite good:

import numpy as np
new_image = np.zeros((1024, 1024), dtype=np.int16)
valid = (image1 != 0) & (image2 != 0)
new_image[valid] = (image1 + image2)[valid] // 2

Another solution uses masked arrays. Note that np.ma.masked_equal copies its input by default; pass copy=False if you want the masked arrays to be views of the original image1 and image2:

m1 = np.ma.masked_equal(image1, 0)
m2 = np.ma.masked_equal(image2, 0)
new_image = ((m1 + m2) // 2).filled(0)

Update: The first solution seems to be 3 times faster than the second for arrays with about 1000 non-zero entries.
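As a sanity check, here is a sketch that compares the boolean-index and masked-array approaches against the explicit double loop from the question, with the division by 2 included. The test images are hypothetical (small synthetic values, many of them zero), and the variable names are chosen only for this example:

```python
import numpy as np

# Hypothetical test images with plenty of zero ("invalid") pixels
grid = np.arange(64 * 64).reshape(64, 64)
image1 = (grid % 3 * 100).astype(np.int16)
image2 = (grid % 5 * 50).astype(np.int16)

# Boolean-index approach
valid = (image1 != 0) & (image2 != 0)
by_index = np.zeros(image1.shape, dtype=np.int16)
by_index[valid] = (image1 + image2)[valid] // 2

# Masked-array approach
m1 = np.ma.masked_equal(image1, 0)
m2 = np.ma.masked_equal(image2, 0)
by_mask = ((m1 + m2) // 2).filled(0)

# Reference: explicit double loop, as in the question
by_loop = np.zeros(image1.shape, dtype=np.int16)
for y in range(image1.shape[0]):
    for x in range(image1.shape[1]):
        if image1[y, x] != 0 and image2[y, x] != 0:
            by_loop[y, x] = (image1[y, x] + image2[y, x]) // 2

print(np.array_equal(by_index, by_loop), np.array_equal(by_mask, by_loop))
```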

Upvotes: 1

Tom Zych

Reputation: 13576

Looping with native Python code is generally much slower than using built-in tools that use fast C loops. I'm not familiar with NumPy; can you use map() to do a transformation from your two input arrays to the output? If so, that should be faster.
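For reference, a small sketch of what the map() route could look like with NumPy, assuming Python 3; the helper name `mean_or_zero` and the sample values are hypothetical. map() still calls a Python function once per element, so it may beat the explicit double loop but is unlikely to match a fully vectorized expression, and its output has to be turned back into an array explicitly:

```python
import numpy as np

def mean_or_zero(val1, val2):
    # Integer mean of two pixels, or 0 if either one is masked out
    return (val1 + val2) // 2 if val1 != 0 and val2 != 0 else 0

image1 = np.array([[10, 0], [30, 40]], dtype=np.int16)
image2 = np.array([[20, 50], [0, 60]], dtype=np.int16)

# map() yields plain Python values, so rebuild the array from the flattened inputs
flat = map(mean_or_zero, image1.ravel().tolist(), image2.ravel().tolist())
via_map = np.fromiter(flat, dtype=np.int16, count=image1.size).reshape(image1.shape)

# Vectorized equivalent, for comparison
vectorized = np.where((image1 != 0) & (image2 != 0), (image1 + image2) // 2, 0)

print(np.array_equal(via_map, vectorized))
```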

Upvotes: 2
