MLhacker

Reputation: 1512

Improving speed with Cython when dealing with Numpy ndarray

I'm testing Cython performance on a function that modifies a NumPy ndarray in place: each element is squared if the sum of its row index i and column index j is even, and negated if the sum is odd. Compared to pure Python, the Cython version is only about 80% faster, which is a mediocre gain. I've run out of ideas. Any suggestions?

@Python:

def even_odd_function(matrix):

    dim = matrix.shape[1]

    for i in range(dim):
        for j in range(dim):
            if (i + j) % 2 == 0:
                matrix[i, j] = matrix[i, j] ** 2
            else:
                matrix[i, j] = matrix[i, j] * -1

    return matrix
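To make the intended behaviour concrete, here is a minimal worked example of the pure-Python function on a 3x3 array. The input values are made up for illustration; note that the function modifies its argument in place and returns the same object.

```python
import numpy as np

def even_odd_function(matrix):
    # Same logic as the function in the question; assumes a square matrix.
    dim = matrix.shape[1]
    for i in range(dim):
        for j in range(dim):
            if (i + j) % 2 == 0:
                matrix[i, j] = matrix[i, j] ** 2   # even index sum: square
            else:
                matrix[i, j] = matrix[i, j] * -1   # odd index sum: negate
    return matrix

# Illustrative input, not taken from the question.
small = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])
result = even_odd_function(small)
print(result)
# [[ 1. -2.  9.]
#  [-4. 25. -6.]
#  [49. -8. 81.]]
```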

@Cython:

%%cython

import numpy as np
cimport numpy as np
cimport cython

DTYPE = np.int64
ctypedef np.int64_t DTYPE_t

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def even_odd_function7(np.ndarray matrix):

    cdef int dim = matrix.shape[1]
    cdef int i
    cdef int j

    for i in range(dim):
        for j in range(dim):
            if (i + j) % 2 == 0:
                matrix[i, j] = matrix[i, j] * matrix[i, j]
            else:
                matrix[i, j] = matrix[i, j] * -1

    return matrix

Here are the highlighted lines: (screenshot not included)

Upvotes: 0

Views: 513

Answers (1)

chrisb

Reputation: 52286

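You need to declare the element type and number of dimensions of the array to get the major speedup. With a plain np.ndarray argument, every matrix[i, j] access still goes through Python-level indexing.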

import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def even_odd_function8(np.ndarray[np.float64_t, ndim=2] matrix):

    cdef int dim = matrix.shape[1]
    cdef int i
    cdef int j

    for i in range(dim):
        for j in range(dim):
            if (i + j) % 2 == 0:
                matrix[i, j] = matrix[i, j] * matrix[i, j]
            else:
                matrix[i, j] = matrix[i, j] * -1

    return matrix
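One consequence of the typed signature is that it accepts only 2-D float64 arrays; passing an integer array raises a ValueError about a buffer dtype mismatch. A minimal sketch of preparing the input on the Python side follows. The helper name as_float64_2d is hypothetical and not part of the original answer.

```python
import numpy as np

def as_float64_2d(a):
    """Hypothetical helper: return a 2-D float64 array for the typed function."""
    # np.asarray copies only when a dtype conversion is needed.
    out = np.asarray(a, dtype=np.float64)
    if out.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {out.ndim} dimension(s)")
    return out

ints = np.arange(9).reshape(3, 3)   # integer dtype, not accepted by the typed signature
floats = as_float64_2d(ints)        # float64 copy, safe to pass in
```

Because the conversion creates a copy when the dtype differs, the in-place changes made by the compiled function would then land on the copy, not on the original integer array.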

In [20]: arr = np.random.randn(1000, 1000)

In [21]: %timeit even_odd_function(arr)
1 loop, best of 3: 636 ms per loop

In [22]: %timeit even_odd_function7(arr)
1 loop, best of 3: 480 ms per loop

In [24]: %timeit even_odd_function8(arr)
1000 loops, best of 3: 1.61 ms per loop
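Since all of these functions modify their argument in place, each %timeit loop keeps transforming the same array, which is fine for timing but not for checking results. A minimal sketch of a correctness check on copies is below. The names even_odd_reference and even_odd_vectorized are hypothetical, and the NumPy-only variant is used here purely as a cross-check, not as the method from this answer.

```python
import numpy as np

def even_odd_reference(matrix):
    """The pure-Python loop from the question, applied to a copy of the input."""
    out = matrix.copy()
    dim = out.shape[1]
    for i in range(dim):
        for j in range(dim):
            if (i + j) % 2 == 0:
                out[i, j] = out[i, j] ** 2
            else:
                out[i, j] = out[i, j] * -1
    return out

def even_odd_vectorized(matrix):
    """Hypothetical NumPy-only variant; returns a new array."""
    rows, cols = np.indices(matrix.shape)
    even = (rows + cols) % 2 == 0
    return np.where(even, matrix * matrix, -matrix)

rng = np.random.default_rng(0)
arr = rng.standard_normal((50, 50))
expected = even_odd_reference(arr)
matches = np.allclose(even_odd_vectorized(arr), expected)
print(matches)

# To check a compiled function the same way, pass a copy, for example:
# np.allclose(even_odd_function8(arr.copy()), expected)
```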

This is largely a matter of style, but I prefer the newer typed memoryview syntax, which does the same thing:

def even_odd_function(np.float64_t[:, :] matrix):

Upvotes: 4
