mephist

Reputation: 11

Numba Invalid use of BoundFunction of array.mean

For each index along the third axis, I want to calculate the mean over the second axis.

@njit
def mean_some_index(a):
    T = a.shape[2]
    b = np.zeros((T,T))
    for t in range(T):
        b[:, t] = a[:,:,t].mean(axis = 1)
    return b

I call it like this:

a = np.random.randn(5*5*5).reshape((5,5,5))
mean_some_index(a)

It works without Numba; with the @njit decorator, however, Numba raises an error saying:

resolving callee type: BoundFunction(array.mean for array(float64, 2d, A))
...
File "C:\Users\Mining-Base\AppData\Local\Temp\ipykernel_565300\1191607406.py", line 7:
def mean_some_index(a):
    <source elided>
    for t in range(T):
        b[:, t] = a[:,:,t].mean(axis = 1)

I don't quite understand the error and would appreciate any help.

Upvotes: 1

Views: 278

Answers (1)

Nathan Furnal

Reputation: 2410

Edit: added the swapped loop version and larger arrays for benchmarking. Also, it seems that numba was very helpful for small arrays (hundreds of elements) but less so for large arrays (millions of elements).

Edit2: added parallel code.


That's because Numba only supports array.mean() without optional arguments, so the axis=1 keyword is what fails to type; see the supported NumPy features page in the Numba docs.
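As a minimal sketch of what does work: calling mean() with no arguments on a 1-D row should compile under @njit. The names row_means and mat are made up for illustration, and the ImportError fallback is my own addition so the snippet still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (an assumption for portability, not part of the answer):
    # run the function as plain Python when Numba is unavailable.
    def njit(func):
        return func


@njit
def row_means(mat):
    # mat is 2-D; mat[i] is a 1-D row, and mean() is called without arguments.
    n_rows = mat.shape[0]
    out = np.empty(n_rows)
    for i in range(n_rows):
        out[i] = mat[i].mean()
    return out


mat = np.arange(12, dtype=np.float64).reshape(3, 4)
result = row_means(mat)  # same values as mat.mean(axis=1) in plain NumPy
```

The per-row loop replaces the axis keyword: each call reduces one 1-D slice, which is the form Numba accepts.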

We can still get a speedup by computing each mean in an explicit loop and calling mean() without arguments.

import numpy as np
from numba import njit, prange

rng = np.random.default_rng()


# arr = rng.standard_normal(size=(5, 5, 5))
arr = rng.standard_normal(size=(500, 500, 500))

def np_mean(arr):
    # out[i, ax] is the mean over axis 1 of arr[i, :, ax]
    x_dim, z_dim = arr.shape[0], arr.shape[2]
    out = np.empty((x_dim, z_dim))
    for ax in range(z_dim):
        out[:, ax] = arr[:, :, ax].mean(axis=1)
    return out


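@njit
def nb_mean(arr):
    # idx runs over axis 0, so the output has shape (x_dim, z_dim)
    x_dim, z_dim = arr.shape[0], arr.shape[2]
    out = np.empty((x_dim, z_dim))
    for ax in range(z_dim):
        for idx in range(x_dim):
            # mean() without arguments is the form Numba supports
            out[idx, ax] = arr[:, :, ax][idx].mean()
    return out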

@njit
def nb_mean_swapped(arr):
    # Same computation as nb_mean, with the two loops swapped
    x_dim, z_dim = arr.shape[0], arr.shape[2]
    out = np.empty((x_dim, z_dim))
    for idx in range(x_dim):
        for ax in range(z_dim):
            out[idx, ax] = arr[:, :, ax][idx].mean()
    return out

@njit(parallel=True)
def nb_mean_swapped_parallel(arr):
    x_dim, z_dim = arr.shape[0], arr.shape[2]
    out = np.empty((x_dim, z_dim))
    # Numba parallelizes only the outermost prange;
    # the inner prange runs as an ordinary range.
    for idx in prange(x_dim):
        for ax in prange(z_dim):
            out[idx, ax] = arr[:, :, ax][idx].mean()
    return out

In [27]: %timeit np_mean(arr)
674 ms ± 23.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [28]: %timeit nb_mean(arr)
606 ms ± 28.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [29]: %timeit nb_mean_swapped(arr)
218 ms ± 23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [30]: %timeit nb_mean_swapped_parallel(arr)
64.3 ms ± 2.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
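
As a cross-check that was not part of the benchmark above: outside of Numba, the per-slice loop should give the same result as a single reduction over axis 1. The sketch below uses a small, deliberately non-cubic array (the name small and its shape are my own choices for illustration) to compare the two.

```python
import numpy as np

rng = np.random.default_rng(0)
small = rng.standard_normal(size=(3, 4, 5))  # non-cubic on purpose

# Loop version, mirroring np_mean but with the output sized (x_dim, z_dim).
x_dim, z_dim = small.shape[0], small.shape[2]
looped = np.empty((x_dim, z_dim))
for ax in range(z_dim):
    looped[:, ax] = small[:, :, ax].mean(axis=1)

# Single call: average over axis 1, keeping axes 0 and 2.
vectorized = small.mean(axis=1)

print(np.allclose(looped, vectorized))
```

Using a non-cubic shape makes it easier to spot a mismatch between the loop bounds and the output dimensions, which a 5 x 5 x 5 or 500 x 500 x 500 test array would hide.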

Upvotes: 1
