numpy fftn very inefficient for 2d fft of several images

Question

I wanted to compute the Fourier transform of several images, so I benchmarked numpy's fft.fftn against a brute-force for loop that calls fft2 on each image.

This is the code I used to benchmark the two approaches (in a Jupyter notebook):

import numpy as np

x = np.random.rand(32, 256, 256)

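def iterate_fft(arr):
    # Preallocate in complex128, the dtype np.fft.fft2 returns for float64 input
    k = np.empty_like(arr, dtype=np.complex128)
    # Transform each 2D image separately
    for i, a in enumerate(arr):
        k[i] = np.fft.fft2(a)
    return k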

k_it = iterate_fft(x)
k_np = np.fft.fftn(x, axes=(1, 2))
np.testing.assert_allclose(k_it.real, k_np.real)
np.testing.assert_allclose(k_it.imag, k_np.imag)

%%timeit
k_it = iterate_fft(x)

Output: 63.6 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
k_np = np.fft.fftn(x, axes=(1, 2))

Output: 122 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Why is the single fftn call almost twice as slow as the loop?

Zaccharie Ramzi · Accepted Answer

Someone involved in numpy's fft development answered the underlying question on GitHub: the slowdown most likely comes from the rearrangement of multi-dimensional arrays that pocketfft performs.
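To picture why a transform over several axes can involve extra data movement, note that a multi-axis FFT is separable: it gives the same result as 1D FFTs applied along each axis in turn. The sketch below only checks that mathematical equivalence on a small random array (the array size and variable names are my own choices). It is an illustration, not a description of pocketfft's internals; the comment about strided access is an assumption about where rearrangement cost could arise.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 64, 64))  # small stack of 4 images, sizes chosen for illustration

# One call over the two image axes
direct = np.fft.fftn(x, axes=(1, 2))

# Same transform as two passes of 1D FFTs
pass_last = np.fft.fft(x, axis=2)          # last axis: contiguous in memory
separable = np.fft.fft(pass_last, axis=1)  # middle axis: strided access (assumed costlier)

# Per-image loop, as in the question
looped = np.stack([np.fft.fft2(image) for image in x])

print(np.allclose(direct, separable))
print(np.allclose(direct, looped))
```

All three routes compute the same numbers, so any timing difference between them comes from how the data is traversed and copied, not from the amount of arithmetic.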

This should become a thing of the past once numpy switches to the implementation shipped with scipy 1.4, which, according to my benchmark, does not have this drawback.
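For anyone who wants to repeat the comparison outside a notebook, here is a minimal sketch using the standard timeit module. It assumes that "the scipy 1.4 implementation" refers to the scipy.fft module introduced in that release; the helper names loop_fft2 and best_ms are mine, and the numbers you get will depend on your machine and library versions.

```python
import timeit

import numpy as np

try:
    import scipy.fft as sp_fft  # assumption: requires scipy >= 1.4
except ImportError:
    sp_fft = None

x = np.random.rand(32, 256, 256)


def loop_fft2(arr):
    """Transform each 2D image separately with numpy, as in the question."""
    out = np.empty(arr.shape, dtype=np.complex128)
    for i, image in enumerate(arr):
        out[i] = np.fft.fft2(image)
    return out


def best_ms(fn, repeat=3, number=2):
    """Best average time per call in milliseconds over a few repeats."""
    return 1e3 * min(timeit.repeat(fn, repeat=repeat, number=number)) / number


print("numpy loop of fft2:", best_ms(lambda: loop_fft2(x)), "ms")
print("numpy fftn:        ", best_ms(lambda: np.fft.fftn(x, axes=(1, 2))), "ms")
if sp_fft is not None:
    print("scipy.fft fftn:    ", best_ms(lambda: sp_fft.fftn(x, axes=(1, 2))), "ms")
```

Taking the minimum over repeats reduces noise from other processes; %%timeit reports a mean and standard deviation instead, so the figures are not directly comparable with the ones in the question.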
