Reputation: 19
I would like to get the fastest solution to write data in a 2D numpy array using an array of indexes.
I have a large 2D boolean numpy array buffer
import numpy as np
n_rows = 100000
n_cols = 250
shape_buf = (n_rows, n_cols)
row_indexes = np.arange(n_rows,dtype=np.uint32)
w_idx = np.random.randint(n_cols, size=n_rows, dtype = np.uint32)
buffer = np.full(shape=shape_buf,
fill_value=0,
dtype=np.bool_,order="C")
I want to write data in the buffer using a list of indexes w_idx
data = np.random.randint(0,2, size=n_rows, dtype = np.bool_)
w_idx = np.random.randint(n_cols, size=n_rows, dtype = np.uint32)
One solution is to use standard indexing :
%timeit buffer[row_indexes, w_idx] = data
2.07 ms ± 20.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
A faster solution is to flatten the indexes and to use np.put :
%timeit buffer.put(flat_row_indexes + w_idx, data, "wrap")
1.76 ms ± 18.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
However this last solution is still to slow for my application. Is it possible to be faster ? Maybe by using another library, say Numba ?
Upvotes: 0
Views: 385
Reputation: 114360
The timing result you got makes sense given the fact that the first assignment fills in all 800 rows for each column, while the second one actually places the individual elements you want into the array. The reason the first version appears to be ~100x faster instead of ~800x faster is that the overhead of a call to put
for such a small dataset is going to overwhelm the timing result.
First lesson: always test numpy operations on a small array that you can see. Usually no larger than 5x5, so you can compare to a hand-calculated version if necessary.
Second lesson: benchmarks on small arrays are unreliable. An O(n) algorithm only achieves linear scaling asymptotically. Timing for small arrays is dominated by function calls and other (usually constant-time) bookkeeping, especially in python.
The best thing I can think of is to avoid the overhead of calling put
:
buffer[np.arange(n_rows), w_idx] = data
Another option, since you already have the offsets pre-computed, and all your arrays are contiguous, is to assign linear indices:
buffer.ravel()[w_idx + flat_row_indices] = data
Upvotes: 2